ARRAYS OF NUCLEIC ACID PROBES ON BIOLOGICAL CHIPS

CROSS-REFERENCE TO RELATED APPLICATION

This is a Continuation of application Ser. No. 08/143,312, filed Oct. 26, 1993, now abandoned, which is a continuation in part of U.S. patent application Ser. No. 082,937, filed Jun. 25, 1993, now abandoned, incorporated herein by reference.

Research leading to the invention was funded in part by NIH grant No. 1R01HG00813-01 and DOE grant No. DE-FG03-92-ER81275, and the government may have certain rights to the invention.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention provides arrays of oligonucleotide probes immobilized in microfabricated patterns on silica chips for analyzing molecular interactions of biological interest. The invention therefore relates to diverse fields impacted by the nature of molecular interaction, including chemistry, biology, medicine, and medical diagnostics.

2. Description of Related Art

Oligonucleotide probes have long been used to detect complementary nucleic acid sequences in a nucleic acid of interest (the "target" nucleic acid). In some assay formats, the oligonucleotide probe is tethered, i.e., by covalent attachment, to a solid support, and arrays of oligonucleotide probes immobilized on solid supports have been used to detect specific nucleic acid sequences in a target nucleic acid. See, e.g., PCT patent publication Nos. WO 89/10977 and 89/11548. Others have proposed the use of large numbers of oligonucleotide probes to provide the complete nucleic acid sequence of a target nucleic acid but failed to provide an enabling method for using arrays of immobilized probes for this purpose. See U.S. Pat. Nos. 5,202,231 and 5,002,867 and PCT patent publication No. WO 93/17126.

The development of VLSIPS™ technology has provided methods for making very large arrays of oligonucleotide probes in very small arrays. See U.S. Pat. No. 5,143,854 and PCT patent publication Nos. WO 90/15070 and 92/10092, each of which is incorporated herein by reference. U.S. patent application Ser. No. 082,937, filed Jun. 25, 1993, describes methods for making arrays of oligonucleotide probes that can be used to provide the complete sequence of a target nucleic acid and to detect the presence of a nucleic acid containing a specific nucleotide sequence.

Microfabricated arrays of large numbers of oligonucleotide probes, called "DNA chips," offer great promise for a wide variety of applications. New methods and reagents are required to realize this promise, and the present invention helps meet that need.

SUMMARY OF THE INVENTION

The present invention provides methods for making high-density arrays of oligonucleotide probes on silica chips and for using those probe arrays to detect specific nucleic acid sequences contained in a target nucleic acid in a sample. The invention also provides arrays of oligonucleotide probes on DNA chips, in which the probes have specific sequences and locations in the array to facilitate identification of a specific target nucleic acid. In another aspect, the invention provides methods for detecting whether one or more specific sequences of a target nucleic acid in a sample varies from a previously characterized sequence or reference sequence. The methods of the invention can be used to detect variations between a target and reference sequence, including single or multiple base substitutions, and deletions and insertions of bases, as well as detecting the presence, location, and sequence of other more complex variations between a target and reference sequence in a nucleic acid.

The present invention provides arrays of oligonucleotide probes immobilized on a solid support. The arrays are preferably synthesized directly on the support using VLSIPS™ technology, but other synthesis methods and immobilization of pre-synthesized oligonucleotide probes can be used to make the oligonucleotide probe arrays, called "DNA chips," of the invention. In general, these arrays comprise a set of oligonucleotide probes such that, for each base in a specific reference sequence, the set includes a probe (called the "wild-type" or "WT" probe) that is exactly complementary to a section of the reference sequence including the base of interest and four additional probes (called "substitution probes"), which are identical to the WT probe except that the base of interest has been replaced by one of a predetermined set (typically 4) of nucleotides. In the preferred embodiment, one of the four substitution probes is identical to the wild-type probe; the other three are complementary to targets that have a single-base substitution at this position.

In another aspect, the invention relates to the arrangement of individual probes in the array. In one embodiment, the probes are arranged on the chip so that probes for a given position in the sequence are adjacent, and probes for adjacent positions in the reference sequence are also adjacent to one another on the chip. One method arranges the probes for a single base in a short column (alternately row) and arranges the columns in the order of the base position to form horizontal (alternately vertical) stripes. The wild-type and each of the substitution probes have specified positions within the column so that all the probes corresponding to an A substitution, for example, are in a single row. The stripes may be separated on the chip by a blank row or column.

The DNA chips of the invention can be made in a wide number of variations. For some applications, leaving out the wild-type row, leaving out unimportant bases, pooling bases, including insertion and deletion probes, varying the length of the probes in a set to make the probes have the same or similar Tm relative to [...], [...] mutation [...] position, [...] multiple probes for [...] mutation, [...] replicate probes, or [...] of individual [...] may be appropriate.

The present invention also provides DNA chips for detecting mutations associated with cystic fibrosis, including mutations in exons 4, 7, 9, 10, 11, 20, and 21 of the CFTR gene. The invention also provides DNA chips for detecting mutations in the p53 gene, a gene in which mutations are known to be associated with a variety of cancers. Other DNA chips of the invention provide probe arrays for detecting specific sequences of mitochondrial DNA, useful for identification and forensic purposes. The invention also provides DNA chips for detecting specific sequences of nucleotides or mutations associated with the acquisition of a drug resistant phenotype in an infectious organism, such as rifampicin or other drug resistant TB strains and HIV, in which mutations in an RNA polymerase gene are known to give rise to resistance.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows how the tiling method of the invention defines a set of DNA probes relative to a target nucleic acid.
In the figure, the target is a DNA molecule, the probes are single-stranded nucleic acids 16 nucleotides in length, and only a portion of the probes defined by the method is shown.

FIG. 2 shows an illustrative tiled array of the invention with probes for the detection of point mutations. The base at the position of substitution in each of the wild-type probes is shown in the wild-type lane, and the shading shows the location of the substitution probe having the wild-type sequence. The SEQ ID. NOS. corresponding to the two peptide sequences shown in the top portion of FIG. 2 are 311 and 312, respectively. The SEQ ID. NOS. corresponding to the five peptide sequences listed at the bottom of FIG. 2 are 313, 314, 315, 313, and 316, respectively.

FIG. 3, in panels A, B, and C, shows an image made from the region of a DNA chip containing CFTR exon 10 probes; in panel A, the chip was hybridized to a wild-type target; in panel C, the chip was hybridized to a mutant ΔF508 target; and in panel B, the chip was hybridized to a mixture of the wild-type and mutant targets. The SEQ ID. NOS. corresponding to the four peptide sequences shown in FIG. 3 are 317-320, respectively.

FIG. 4, in sheets 1-3, corresponding to panels A, B, and C of FIG. 3, shows graphs of fluorescence intensity versus tiling position. The labels on the horizontal axis show the bases in the wild-type sequence corresponding to the position of substitution in the respective probes. Plotted are the intensities observed from the features (or synthesis sites) containing wild-type probes, the features containing the substitution probes that bound the most target ("called"), and the feature containing the substitution probes that bound the target with the second highest intensity of all the substitution probes ("2nd Highest"). The SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 1 of FIG. 4 are 321 and 318, respectively; the SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 2 of FIG. 4 are 322 and 318, respectively; and the SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 3 of FIG. 4 are 323 and 318, respectively.

FIG. 5, in panels A, B, and C, shows an image made from a region of a DNA chip containing CFTR exon 10 probes; in panel A, the chip was hybridized to the wt480 target; in panel C, the chip was hybridized to the mu480 target; and in panel B, the chip was hybridized to a mixture of the wild-type and mutant targets. The SEQ ID. NOS. corresponding to the peptide sequences shown in FIG. 5 are 324-327, respectively.

FIG. 6, in sheets 1-3, corresponding to panels A, B, and C of FIG. 5, shows graphs of fluorescence intensity versus tiling position. The labels on the horizontal axis show the bases in the wild-type sequence corresponding to the position of substitution in the respective probes. Plotted are the intensities observed from the features (or synthesis sites) containing wild-type probes, the features containing the substitution probes that bound the most target ("called"), and the feature containing the substitution probes that bound the target with the second highest intensity of all the substitution probes ("2nd Highest"). The SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 1 of FIG. 6 are 328 and 329, respectively; the SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 2 of FIG. 6 are 330 and 329, respectively; and the SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 3 of FIG. 6 are 331 and 329, respectively.

FIG. 7, in panels A and B, shows an image made from a region of a DNA chip containing CFTR exon 10 probes; in panel A, the chip was hybridized to nucleic acid derived from the genomic DNA of an individual with wild-type ΔF508 sequences; in panel B, the target nucleic acid originated from a heterozygous (with respect to the ΔF508 mutation) individual.

FIG. 8, in sheets 1 and 2, corresponding to panels A and B of FIG. 7, shows graphs of fluorescence intensity versus tiling position. The labels on the horizontal axis show the bases in the wild-type sequence corresponding to the position of substitution in the respective probes. Plotted are the intensities observed from the features (or synthesis sites) containing wild-type probes, the features containing the substitution probes that bound the most target ("called"), and the feature containing the substitution probes that bound the target with the second highest intensity of all the substitution probes ("2nd Highest"). The SEQ ID. NOS. corresponding to the two peptide sequences shown in sheet 2 of FIG. 8 are 332 and 318, respectively.

FIG. 9 shows the human mitochondrial genome; "O_H" is the H strand origin of replication, and arrows indicate the cloned unshaded sequence.

FIG. 10 shows the image observed from application of a sample of mitochondrial DNA derived nucleic acid (from the mt4 sample) on a DNA chip.

FIG. 11 is similar to FIG. 10 but shows the image observed from the mt5 sample.

FIG. 12 shows the predicted difference image between the mt4 and mt5 samples on the DNA chip based on mismatches between the two samples and the reference sequence.

FIG. 13 shows the actual difference image observed for the mt4 and mt5 samples.

FIG. 14, in sheets 1 and 2, shows a plot of normalized [...] across rows 10 and 11 of the array and a tabulation [...].

FIG. 15 shows [...] wild-type and mutant hybrids obtained with the chip. A median of the six normalized hybridization scores for each probe was taken; the graph plots the ratio of the median score to the normalized hybridization score versus mean counts. A ratio of 1.6 [...] and mean counts above 50 [...] no false positives.

FIG. 16 illustrates how the identity of the base mismatch may influence the ability to discriminate mutant and wild-type sequences more than the position of the mismatch within an oligonucleotide probe. The mismatch position is expressed as % of probe length from the 3'-end. The base change is indicated on the graph.

FIG. 17 provides a 5' to 3' sequence listing of one target corresponding to the probes on the chip. X is a control probe. [...] (the probe at the designated site) are in bold. The SEQ ID. NO. corresponding to the peptide sequence shown in FIG. 17 is [...].

FIG. 18 shows the fluorescence image produced by scanning the chip described in FIG. 17 when hybridized to a sample.

FIG. 19 illustrates the detection of 4 transitions in the target sequence relative to the wild-type probes on the chip in FIG. 18.

FIG. 20 shows the alignment of some of the probes on a p53 DNA chip with a 12-mer model target nucleic acid. The SEQ ID. NOS. corresponding to the fourteen peptide sequences shown in FIG. 20 are 334-347, respectively.

FIG. 21 shows a set of 10-mer probes for a p53 exon 6 DNA chip. The SEQ ID. NOS. corresponding to the thirteen peptide sequences shown in FIG. 21 are 334 and 348-359, respectively.
FIG. 22 shows that very distinct patterns are observed after hybridization of p53 DNA chips with targets having different 1 base substitutions. In the first image in FIG. 22, the 12-mer probes that form perfect matches with the wild-type target are in the first row (top). The 12-mer probes with single base mismatches are located in the second, third, and fourth rows and have much lower signals.

FIG. 23, in graphs 2, 3, and 4, graphically depicts the data in FIG. 22. On each graph, the X ordinate is the position of the probe in its row on the chip, and the Y ordinate is the signal at that probe site after hybridization.

FIG. 24 shows the results of hybridizing mixed target populations of WT and mutant p53 genes to the p53 DNA chip.

FIG. 25, in graphs 1-4, shows (see FIG. 23 as well) the hybridization efficiency of a 10-mer probe array as compared to a 12-mer probe array.

FIG. 26 shows [...] a target DNA [...].

FIG. 27 illustrates how the actual sequence was read from the chip shown in FIG. 26. Gaps in the sequence of letters in the WT rows correspond to control probes or sites. Positions at which bases are miscalled are represented by letters in italic type in cells corresponding to probes in which the WT bases have been substituted by other bases. The SEQ ID. NO. corresponding to the peptide sequence shown in FIG. 27 is [...].

FIG. 28 illustrates the VLSIPS™ technology as applied to the light-directed synthesis of oligonucleotides. Light (hv) is shone through a mask (M1) to activate functional groups (-OH) on a surface by removal of a protecting group (X). Nucleoside building blocks protected with photoremovable protecting groups (T-X, C-X) are coupled to the activated areas. By repeating the irradiation and coupling steps, very complex arrays of oligonucleotides can be prepared.

FIG. 29 illustrates how the VLSIPS™ process can be used to prepare "nucleoside combinatorials" or oligonucleotides synthesized by coupling all four nucleosides to form dimers, trimers, etc.

FIG. 30 shows the deprotection, coupling, and oxidation steps of a solid phase DNA synthesis method.

FIG. 31 shows an illustrative synthesis route for the nucleoside building blocks used in the VLSIPS™ method.

FIG. 32 shows a preferred photoremovable protecting group, MeNPOC, and how to prepare the group in active form.

FIG. 33 shows an illustrative detection system for scanning a DNA chip.

Using the VLSIPS™ method, one can synthesize arrays of many thousands of oligonucleotide probes on a substrate, such as a glass slide or chip. The method can be used, for instance, to synthesize "combinatorial" arrays consisting of, for example, all possible octanucleotides. Such arrays can be used for primary sequencing-by-hybridization on genomic DNA fragments or other nucleic acids or to detect mutations in a target nucleic acid for which the normal or "wild-type" nucleotide sequence is already known. Using the preferred method of the invention, one employs a strategy called "tiling" to synthesize specific sets of probes at spatially-defined locations on a substrate, creating the novel probe arrays and "DNA chips" of the invention.

To illustrate the tiling method of the invention, consider the problem of detecting mutations at one or more positions in the nucleotide sequence of a target nucleic acid with oligonucleotide probes of defined length. The length (L) of the probe is typically expressed as the number of nucleotides or bases in a single-stranded nucleic acid probe. For purposes of the present invention, lengths ranging from 12 to 18 bases are preferred, although shorter and longer lengths can also be employed. To employ the tiling method, one synthesizes a set of probes defined by the particular nucleotide sequence of interest in the target nucleic acid. For each base in the target DNA segment, one synthesizes a probe complementary to the subsequence of the target nucleic acid beginning at that base and ending L-1 bases to the 3'-side (see FIG. 1).

In a preferred embodiment of the invention, the probes are arranged (either by immobilization, typically by covalent attachment, of a pre-synthesized probe or by synthesis of the probe on the substrate) on the substrate or chips in lanes stretching across the chip and separated, and these lanes are in turn arranged in blocks of preferably 5 lanes, although blocks of other sizes will have useful application, as will be apparent from the following illustration. The first of these five lanes, called the "wild-type lane," contains probes arranged in order of sequence, and all of the probes are complementary to a specified wild-type nucleic acid sequence. The other four lanes contain probe sets for detecting all possible single-base mutations in the defined sequence; in turn, these probe sets are defined by a position of potential non-complementarity in the probe relative to the target (i.e., a single base mismatch) and the identity of the nucleotide in the probe at that position (i.e., whether the nucleotide is an A, C, G, or T nucleotide). The position of mismatch, also called the position of substitution, is preferably selected to be near the center of the probes, i.e., position 7 of a probe of L=15.

For each probe in the wild-type lane, one synthesizes four probes (one for each of the lanes other than the wild-type lane); three of these four probes are identical to the corresponding wild-type probe but for the base at the position of substitution, and the remaining probe is identical to the wild-type probe. This set of four substitution probes is preferably placed in a column directly below (or above) the corresponding wild-type probe, thus creating an A-lane, a C-lane, a G-lane, and a T-lane. FIG. 2 shows an illustrative tiled array of the invention with probes for the detection of point mutations. The base at the position of substitution in each of the wild-type probes is shown in the wild-type lane, and the shading shows the location of the substitution probe having the wild-type sequence. Below are the probes that would be placed in the column marked by the arrow if the probe length were 15 and the position of substitution were 7.

3'-CCGACTGCAGTCGTT (SEQ. ID. NO:1)
3'-CCGACTACAGTCGTT (SEQ. ID. NO:2)
3'-CCGACTCCAGTCGTT (SEQ. ID. NO:3)
3'-CCGACTGCAGTCGTT (SEQ. ID. NO:1)
3'-CCGACTTCAGTCGTT (SEQ. ID. NO:4)

Thus, the substitution lanes occupy four of the five lanes separating successive wild-type lanes on the chip; the blocks of five lanes can be separated by a sixth lane for measurement of background signals.

The DNA chips of the invention have a wide variety of applications. In one embodiment, the DNA chip is used to select an optimal probe from an array of probes. In this embodiment, an array of probes of variable length and sequences is synthesized and then hybridized to a target nucleic acid of known sequence. The pattern of hybridization reveals the optimal length and sequence composition of
probes to detect a particular mutation or other specific sequence of nucleotides. In some circumstances, i.e., target nucleic acids with repeated sequences or with high G/C content, very long probes may be required for optimal detection. In one embodiment for detecting specific sequences in a target nucleic acid with a DNA chip, repeat sequences are detected as follows. The chip comprises probes of length sufficient to extend into the repeat region varying distances from each end. The sample, prior to hybridization, is treated with a labeled oligonucleotide that is complementary to a repeat region but shorter than the full length of the repeat. The target nucleic acid is labeled with a second, distinct label. After hybridization, the chip is scanned for probes that have bound both the labeled target and the labeled oligonucleotide probe; the presence of such bound probes shows that at least two repeat sequences are present.

A variety of methods can be used to enhance detection of labeled targets bound to a probe on the array. In one embodiment, the protein MutS (from E. coli) or equivalent proteins such as yeast MSH1, MSH2, and MSH3; mouse Rep-3, and Streptococcus Hex-A, is used in conjunction with target hybridization to detect probe-target complexes that contain mismatched base pairs. The protein, labeled directly or indirectly, can be added to the chip during or after hybridization of target nucleic acid, and differentially binds to homo- and heteroduplex nucleic acid. A wide variety of dyes and other labels can be used for similar purposes. For instance, the dye YOYO-1 is known to bind preferentially to nucleic acids containing sequences comprising runs of 3 or more G residues.

The DNA chips produced by the methods of the invention can be used to study and detect mutations in exons of human genes of clinical interest, including point mutations and deletions. In the following sections, the method of the invention is illustrated by the detection of mutations in a variety of clinically and medically significant human nucleic acid sequences. Thus, the invention is illustrated first with respect to the preparation of DNA chips for the detection of mutations associated with cystic fibrosis, then with DNA chips for the detection of human mitochondrial DNA sequences, then with DNA chips for the detection of mutations in the human p53 gene associated with cancer, and finally with respect to the detection of mutations in the HIV RT gene associated with drug resistance.

Detection of Cystic Fibrosis Mutations with DNA Chips

A number of years ago, cystic fibrosis, the most common severe autosomal recessive disorder in humans, was shown to be associated with mutations in a gene thereafter named the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) gene. The sequences of the exons and parts of the introns in the gene are known, as are the changes corresponding to several hundred known mutations. Several tests have been developed for detecting the most frequent of these mutations. The present invention provides CFTR gene oligonucleotide arrays (DNA chips) that can be used to identify mutations in the CFTR gene rapidly and efficiently.

The methods used to make the high-density DNA chips of the invention allow probes for long stretches of DNA coding regions to be directly "written" onto the chips in the form of sets of overlapping oligonucleotides. These methods have been used to develop a number of useful CFTR gene chips; one illustrative chip bears an array of 1296 probes covering the full length of exon 10 of the CFTR gene arranged in a 36x36 array of 356 μm elements. The probes in the array can have any length, preferably in the range of from 10 to 18 residues, and can be used to detect and sequence any single-base substitution and any deletion within the 192-base exon, including the three-base deletion known as ΔF508. As described in detail below, hybridization of sub-nanomolar concentrations of wild-type and ΔF508 oligonucleotide target nucleic acids labeled with fluorescein to these arrays produces highly specific signals (detected with confocal scanning fluorescence microscopy) that permit discrimination between mutant and wild-type target sequences in both homozygous and heterozygous cases. The method and chips of the invention can also be used to detect other known mutations in the CFTR gene, as described in detail below.

The mutation is designated ΔF508 because it is a three-base deletion that results in the removal of amino acid 508 from the CFTR protein. The present invention provides DNA chips for detecting ΔF508; one such chip results from applying the tiling method to exon 10 of the CFTR gene, the exon to which ΔF508 has been mapped. The tiling method involved the synthesis of a set of probes of a selected length in the range of from 10 to 18 bases and complementary to subsequences of the known wild-type CFTR sequence starting at a position a few bases into the intron on the 5'-side of exon 10 and ending a few bases into the intron on the 3'-side. There was a probe for each possible subsequence of the given segment of the gene, and the probes were organized into a "lane" in such a way that traversing the lane from the upper left-hand corner of the chip to the lower right-hand corner corresponded to traversing the gene segment base-by-base from the 5'-end. The lane containing that set of probes is, as noted above, called the "wild-type lane."

Relative to the wild-type lane, a "substitution" lane, called the "A-lane," was synthesized on the chip. The A-lane probes were identical in sequence to an adjacent (immediately below the corresponding) wild-type probe but contained, regardless of the sequence of the wild-type probe, a dA residue at position 7 (counting from the 3'-end). In similar fashion, substitution lanes with replacement bases dC, dG, and dT were placed onto the chip in a "C-lane," a "G-lane," and a "T-lane," respectively. A sixth lane on the chip consisted of probes identical to those in the wild-type lane but for the deletion of the base in position 7 and restoration of the original probe length by addition to the 5'-end of the base complementary to the gene at that position.

The four substitution lanes enable one to deduce the sequence of a target exon 10 nucleic acid from the relative intensities with which the target hybridizes to the probes in the various lanes. The probe organization on the chip can be conveniently columnar, and the set of probes consisting of a wild-type probe and four corresponding substitution probes is referred to as a "column set." One and only one of the four substitution probes in a column set has exactly the same sequence as the wild-type probe in the set. Those of skill in the art will appreciate that, in other embodiments of the invention, one could delete one or more lanes or columns and still benefit from the invention. Various versions of such exon 10 DNA chips were made as described above with probes 15 bases long, as well as chips with probes 10, 14, and 18 bases long. For the results described below, the probes were 15 bases long, and the position of substitution was 7 from the 3'-end.

To demonstrate the ability of the chip to distinguish the ΔF508 mutation from the wild-type, two synthetic target nucleic acids were made. The first, a 39-mer complementary to a subsequence of exon 10 of the CFTR gene having the three bases involved in the ΔF508 mutation near its center, is called the "wild-type" or wt508 target, corresponds to positions 111-149 of the exon, and has the sequence shown below:
5'-CATTAAAGAAAATATCATCTTTGGTGTTTCCTATGATGA (SEQ. ID NO:5).

The second, a 36-mer probe derived from the wild-type target by removing those same three bases, is called the "mutant" target or mu508 target and has the sequence shown below, first with dashes to indicate the deleted bases, and then without dashes but with one base underlined (to indicate the base detected by the T-lane probe, as discussed below):

5'-CATTAAAGAAAATATCAT---TGGTGTTTCCTATGATGA; (SEQ. ID NO:6)
5'-CATTAAAGAAAATATCATTGGTGTTTCCTATGATGA. (SEQ. ID NO:7)

Both targets were labeled with fluorescein at the 5'-end.

In three separate experiments, the wild-type target, the mutant target, and an equimolar mixture of both targets were exposed (0.1 nM wt508, 0.1 nM mu508, and 0.1 nM wt508 plus 0.1 nM mu508, respectively, in a solution compatible with nucleic acid hybridization) to a CF chip. The hybridization mixture was incubated overnight at room temperature, and then the chip was scanned on a reader (a confocal fluorescence microscope in photon-counting mode; images of the chip were constructed from the photon counts) at several successively higher temperatures while still in contact with the target solution. After each temperature change, the chip was allowed to equilibrate for approximately one-half hour before being scanned. After each set of scans, the chip was exposed to denaturing solvent and conditions to wash, i.e., remove target that had bound, the chip so that the next experiment could be done with a clean chip.

The results of the experiments are shown in FIGS. 3, 4, 5, and 6. FIG. 3, in panels A, B, and C, shows an image made from the region of a DNA chip containing CFTR exon 10 probes; in panel A, the chip was hybridized to a wild-type target; in panel C, the chip was hybridized to a mutant ΔF508 target; and in panel B, the chip was hybridized to a mixture of the wild-type and mutant targets; FIG. 4, in sheets 1-3, corresponding to panels A, B, and C of FIG. 3, shows graphs of fluorescence intensity versus tiling position. The labels on the horizontal axis show the bases in the wild-type sequence corresponding to the position of substitution in the respective probes. Plotted are the intensities observed from the features (or synthesis sites) containing wild-type probes, the features containing the substitution probes that bound the most target ("called"), and the feature containing the sub-

whose point of substitution corresponds to the T at the 3'-end of the deletion was very close to background. Following that pattern, the wild-type probe whose point of substitution corresponds to the middle base (also a T) of the deletion bound still less target. However, the probe in the T-lane of that column set bound the target very well.

Examination of the sequences of the two targets reveals that the deletion places an A at that position when the sequences are aligned at their 3'-ends and that the T-lane probe is complementary to the mutant target with but two mismatches near an end (shown below in lower-case letters, with the position of substitution underlined):

Target: 5'-CATTAAAGAAAATATCATTGGTGTTTCCTATGATGA
Probe: 3'-TagTAGTAACCACAA (SEQ. ID NO:8)

Thus the T-lane probe in that column set calls the correct base from the mutant sequence. Note that, in the graph for the equimolar mixture of the two targets, that T-lane probe binds almost as much target as does the A-lane probe in the same column set, whereas in the other column sets, the probes that do not have wild-type sequence do not bind target at all as well. Thus, that one column set, and in particular the T-lane probe within that set, detects the ΔF508 mutation under conditions that simulate the homozygous case and also conditions that simulate the heterozygous case.

The present invention thus provides individual probes, sets of probes, and arrays of probe sets on chips, in specific patterns, as the probes provide important benefits for detecting the presence of specific exon 10 sequences. The sequences of several important probes of the invention are shown below. In each case, the letter "X" stands for the point of substitution in a given column set, so each of the sequences actually represents four probes, with A, C, G, and T, respectively, taking the place of the "X." Sets of shorter probes derived from the sets shown below by removing up to five bases from the 5'-end of each probe and sets of longer probes made from this set by adding up to three bases from the exon 10 sequence to the 5'-end of each probe, are also useful and provided by the invention.

3'-TTTATAXTAGAAACC (SEQ. ID NO:9)
3'-TTATAGXAGAAACCA (SEQ. ID NO:10)
3'-TATAGTXGAAACCAC (SEQ. ID NO:11)
3'-ATAGTAXAAACCACA (SEQ. ID NO:12)
3'-TAGTAGXAACCACAA (SEQ. ID NO:13)
3'-AGTAGAXACCACAAA (SEQ. ID NO:14)
3'-GTAGAAXCCACAAAG (SEQ. ID NO:15)

stitution probes that bound the target with the second highest 3'-TAGAAAXCACAAAGG (SEQ. ID NO: 16) 

intensity of all the substitution probes ("2nd Highest"). 3-AGAAACXACAAAGGA (SEQ. ID NO: 17) 

These figures show that, for the wild-type target and the Although in this example the sequence could not be 

equimolar mixture of targets, the substitution probe with a 50 reliably deduced near the ends of the target, where there is 

nucleotide sequence identical to the corresponding wild- not enough overlap between target and probe to allow 

type probe bound the most target, allowing for an unam- effective hybridization, and around the center of the target, 

biguous assignment of target sequence as shown by letters where hybridization was weak for some other reason, per- 

near the points on the curve. The target wt508 thus hybrid- haps high AT-content, the results show the method and the 

ized to the probes in the wild-type lane of the chip, although 55 probes of the invention can be used to detect the mutation of 

the strength of the hybridization varied from probe-to-probe, interest. The mutant target gave a pattern of hybridization 

probably due to differences in melting temperature. The that was very similar to that of the wt508 target at the ends, 

sequence of most of the target can thus be read directly from where the two share a common sequence, and very different 

the chip, by inference from the pattern of hybridization in in the middle, where the deletion is located. As one scans the 

the lanes of substitution probes (if the target hybridizes most 60 image from right to left, the intensity of hybridization of the 

intensely to the probe in the A-lane, then one infers that the target to the probes in the wild-type lane drops off much 

target has a T in the position of substitution, and so on). more rapidly near the center of the image for mu508 than for 

For the mutant target, the sequence could similarly be wt508; in addition, there is one probe in the T-lane that 

called on the 3'-side of the deletion. However, the intensity hybridizes intensely with mu508 and hardly at all with 

of binding declined precipitously as the point of substitution 65 wt508. The results from the equimolar mixture of the two 

approached the site of the deletion from the 3'-end of the targets, which represents the case one would encounter in 

target, so that the binding intensity on the wild-type probe testing a heterozygous individual for the mutation, are a 
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blend of the results for the separate targets, showing the 
power of the invention to distinguish a wild-type target 
sequence from one containing the AF5Q8 mutation and to 
detect a mixture of the two sequences. 
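
The lane-based reading rule used in this example (a strongest signal in the A-lane implies a T in the target at the position of substitution, and so on) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `call_base` and `read_sequence`, the dictionary layout, and the intensity values are hypothetical and are not taken from the experiments described here.

```python
# Minimal sketch: call the target base at each column set from the
# fluorescence intensities of the four substitution lanes.
# All names and numbers below are illustrative assumptions.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def call_base(lane_intensities):
    """Return (called target base, ratio of best to second-best lane).

    lane_intensities maps the probe base at the position of substitution
    ("A", "C", "G", "T") to the intensity observed for that lane.
    """
    ranked = sorted(lane_intensities.items(), key=lambda item: item[1], reverse=True)
    (best_lane, best), (_, runner_up) = ranked[0], ranked[1]
    ratio = best / runner_up if runner_up > 0 else float("inf")
    # The target base is the complement of the probe base that bound best.
    return COMPLEMENT[best_lane], ratio


def read_sequence(column_sets):
    """Concatenate the called bases for consecutive column sets."""
    return "".join(call_base(lanes)[0] for lanes in column_sets)


# Hypothetical intensities for three adjacent column sets.
example_columns = [
    {"A": 900.0, "C": 50.0, "G": 40.0, "T": 30.0},
    {"A": 20.0, "C": 700.0, "G": 35.0, "T": 25.0},
    {"A": 30.0, "C": 45.0, "G": 60.0, "T": 800.0},
]

if __name__ == "__main__":
    print(read_sequence(example_columns))
```

The ratio returned alongside each call corresponds loosely to the comparison of the "called" and "2nd Highest" curves: a ratio near 1, as for the T-lane and A-lane probes in the equimolar mixture, suggests that two sequences may be present at that position.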

The results above clearly demonstrate how the DNA chips of the invention can be used to detect a deletion mutation, ΔF508; another model system was used to show that the chips can also be used to detect a point mutation. One of the more frequent mutations in the CFTR gene is G480C, which involves the replacement of the G in position 46 of exon 10 by a T, resulting in the substitution of a cysteine for the glycine normally in position #480 of the CFTR protein. The model target sequences included the 21-mer probe wt480 to represent the wild-type sequence at positions 37-55 of exon 10: 5'-CCTTCAGAGGGTAAAATTAAG (SEQ. ID NO:18) and the 21-mer probe mu480 to represent the mutant sequence: 5'-CCTTCAGAGTGTAAAATTAAG (SEQ. ID NO:19).
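
As a small worked check of the G480C model system, the sketch below compares the two 21-mer targets and reports which substitution lane would be fully complementary to each target at the differing position. It is an illustration only: the helper names are hypothetical, and the wt480 string is written on the assumption that it matches mu480 everywhere except at the single G-to-T replacement.

```python
# Minimal sketch: locate the single-base difference between the model
# targets and name the probe lane complementary to each target there.
# Helper names are illustrative assumptions, not part of the source.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# 5'->3' target sequences (SEQ. ID NO:18 and NO:19 as read here).
WT480 = "CCTTCAGAGGGTAAAATTAAG"
MU480 = "CCTTCAGAGTGTAAAATTAAG"


def point_differences(wild_type, mutant):
    """Return (index, wild-type base, mutant base) for each mismatch."""
    if len(wild_type) != len(mutant):
        raise ValueError("targets must have equal length for a point comparison")
    return [
        (i, w, m)
        for i, (w, m) in enumerate(zip(wild_type, mutant))
        if w != m
    ]


def matching_lane(target_base):
    """Probe base (lane) that pairs with the given target base."""
    return COMPLEMENT[target_base]


if __name__ == "__main__":
    for index, wt_base, mu_base in point_differences(WT480, MU480):
        print(
            f"index {index}: wild type {wt_base} (lane {matching_lane(wt_base)}), "
            f"mutant {mu_base} (lane {matching_lane(mu_base)})"
        )
```

Under these assumptions the only difference is a G in the wild-type target versus a T in the mutant target, so the mutant is matched by the A-lane probe of that column set and the wild type by the C-lane probe.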



In separate experiments, a DNA chip was hybridized to each of the targets wt480 and mu480, respectively, and then scanned with a confocal microscope. FIG. 5, in panels A, B, and C, shows an image made from the region of a DNA chip containing CFTR exon 10 probes; in panel A, the chip was hybridized to the wt480 target; in panel C, the chip was hybridized to the mu480 target; and in panel B, the chip was hybridized to a mixture of the wild-type and mutant targets. FIG. 6, in sheets 1-3, corresponding to panels A, B, and C of FIG. 5, shows graphs of fluorescence intensity versus tiling position. The labels on the horizontal axis show the bases in the wild-type sequence corresponding to the position of substitution in the respective probes. Plotted are the intensities observed from the features (or synthesis sites) containing wild-type probes, the features containing the substitution probes that bound the most target ("called"), and the feature containing the substitution probes that bound the target with the second highest intensity of all the substitution probes ("2nd Highest").

These figures show that the chip could be used to sequence a 16-base stretch from the center of the target wt480 and that discrimination against mismatches is quite good throughout the sequenced region. When the DNA chip was exposed to the target mu480, only one probe in the portion of the chip shown bound the target well: the probe in the set of probes devoted to identifying the base at position 46 in exon 10 and that has an A in the position of substitution and so is fully complementary to the central portion of the mutant target. All other probes in that region of the chip have at least one mismatch with the mutant target and therefore bind much less of it. In spite of that fact, the sequence of mu480 for several positions to both sides of the mutation can be read from the chip, albeit with much-reduced intensities from those observed with the wild-type target.

The results also show that, when the two targets were mixed together and exposed to the chip, the hybridization pattern observed was a combination of the other two patterns. The wild-type sequence could easily be read from the chip, but the probe that bound the mu480 target so well when only the mu480 target was present also bound it well when both the mutant and wild-type targets were present in a mixture, making the hybridization pattern easily distinguishable from that of the wild-type target alone. These results again show the power of the DNA chips of the invention to detect point mutations in both homo- and heterozygous individuals.

To demonstrate clinical application of the DNA chips of the invention, the chips were used to study and detect mutations in nucleic acids from genomic samples. Genomic samples from an individual carrying only the wild-type gene and an individual heterozygous for ΔF508 were amplified by PCR using exon 10 primers containing the promoter for T7 RNA polymerase. Illustrative primers of the invention are shown below.

Exon     Name        Sequence
10       CFi9-T7     TAATACGACTCACTATAGGGAGatgacctaataatgatgggttt (SEQ. ID. NO:20)
10       CFi10c-T7   TAATACGACTCACTATAGGGAGtagtgtgaagggttcatatgc (SEQ. ID. NO:21)
10       CFi10c-T3   CTCGGAATTAACCCTCACTAAAGGtagtgtgaagggttcatatg (SEQ. ID. NO:22)
10, 11   CFi10-T7    TAATACGACTCACTATAGGGAGagcatactaaaagtgactctc (SEQ. ID. NO:23)
11       CFi11c-T7   TAATACGACTCACTATAGGGAGacatgaatgacatttacagcaa (SEQ. ID. NO:24)
11       CFi11c-T3   CGGAATTAACCCTCACTAAAGGacatgaatgacatttacagcaa (SEQ. ID. NO:25)

These primers can be used to amplify exon 10 or exon 11 sequences; in another embodiment, multiplex PCR is employed, using two or more pairs of primers, to amplify more than one exon at a time.
The product of amplification was then used as a template for the RNA polymerase, with fluoresceinated UTP present to label the RNA product. After sufficient RNA was made, it was fragmented and applied to an exon 10 DNA chip for 15 minutes, after which the chip was washed with hybridization buffer and scanned with the fluorescence microscope. A useful positive control included on many CF exon 10 chips is the 8-mer 3'-CGCCGCCG-5'. FIG. 7, in panels A and B, shows an image made from a region of a DNA chip containing CFTR exon 10 probes; in panel A, the chip was hybridized to nucleic acid derived from the genomic DNA of an individual with wild-type ΔF508 sequences; in panel B, the target nucleic acid originated from a heterozygous (with respect to the ΔF508 mutation) individual. FIG. 8, in sheets 1 and 2, corresponding to panels A and B of FIG. 7, shows graphs of fluorescence intensity versus tiling position.

These figures show that the sequence of the wild-type RNA can be called for most of the bases near the mutation. In the case of the ΔF508 heterozygous carrier, one particular probe, the same one that distinguished so clearly between the wild-type and mutant oligonucleotide targets in the model system described above, in the T-lane binds a large amount of RNA, while the same probe binds little RNA from the wild-type individual. These results show that the DNA chips of the invention are capable of detecting the ΔF508 mutation in a heterozygous carrier.

Thus, the present invention provides methods for synthesizing large numbers of oligonucleotide probes on a glass substrate and unique probe sets in a defined array in which the probes are arranged in the array by the "tiling" method of the invention. The DNA chips produced by the method can be used to detect mutations in particular sequences of a target nucleic acid, such as genomic DNA or RNA produced from transcription of an amplified genomic DNA. These
chips can be used to detect both point mutations and small deletions. Moreover, the pattern of hybridization to the chip allows inferences to be drawn about the sequences of the mutant DNAs.

For example, in the model system involving the cystic fibrosis point mutation G480C, the A-lane probe whose position of substitution corresponds to the position of the mutation does not bind much wild-type target, because in the wild-type sequence, a G occupies that position. However, it binds mutant target very well, allowing one to infer correctly that the mutation involves a change of that G to a T. Similarly, in the case of the three-base deletion in cystic fibrosis known as ΔF508, the T-lane probe that binds mutant target so intensely is responding to the fact that the deletion has brought a CAT sequence into the position occupied by a CTT sequence in the wild-type target. The DNA chips of the invention can be used to detect and sequence not only known mutations in an organism's genome but also new mutations not previously characterized. The DNA chips and methods of the invention can also be used to detect specific sequences in other CFTR exons as well as other human genes for purposes of research and clinical genetic analysis, as demonstrated below.

Detection of Specific Human Mitochondrial DNA Sequences with DNA Chips

As noted above, the present invention provides DNA chips on which a known DNA sequence is represented as an array of overlapping oligonucleotides on a solid support. This set of oligonucleotides is used to probe a target nucleic acid comprising the known sequence, allowing mutations to be detected. As also noted above, there are advantages in some applications to using a minimal set of oligonucleotides specific to the sequence of interest, rather than a set of all possible N-mers. Some of these advantages include: (i) each position in the array is highly informative, whether or not hybridization occurs; (ii) nonspecific hybridization is minimized; (iii) it is straightforward to correlate hybridization differences with sequence differences, particularly with reference to the hybridization pattern of a standard; and (iv) the ability to address each probe independently during synthesis, using high resolution photolithography, allows the array to be designed and optimized for any sequence. For example, the length of any probe can be varied independently of the others.

The present invention illustrates these advantages by providing DNA chips and analytical methods for detecting specific sequences of human mitochondrial DNA. In one preferred embodiment, the invention provides a DNA chip for analyzing sequences contained in a 1.3 kb fragment of human mitochondrial DNA from the "D-loop" region, the most polymorphic region of human mitochondrial DNA. One such chip comprises a set of 269 overlapping oligonucleotide probes of varying length in the range of 9-14 nucleotides with varying overlaps arranged in ~600x600 micron features or synthesis sites in an array 1 cm x 1 cm in size. The probes on the chip are shown in columnar form below. An illustrative mitochondrial DNA chip of the invention comprises the following probes (X, Y coordinates are shown, followed by the sequence; "DL3" represents the 3'-end of the probe, which is covalently attached to the chip surface.)
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DL3AGTGGGGTATTT 


(SEQ ID. NO:26) 
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10 


2 
2 
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0 


DL3GGGTATTTAGTT 


(SEQ ID. NO:27) 
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DL3TTAGTn*ATCCAA 


(SEQ ID. NO:28) 
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DL3ATCCAAACCAGG 


(SEQ ID. NO:29) 
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DL3ACCAGGATCGGA 


(SEQ ID. NO:30) 
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DL3CGTGTGTGTGTGG 


(SEQ ID. NO:31) 
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0 




: (SEQ ID. NO:32) 
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D L3TCGTGTGTGTGTGG 


(SEQ ID. NO:33)
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DL3GTAGGATGGGTC 


(SEQ ID. NO:34) 
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DL3AGGATGGGTCGT 


(SEQ ID. NO:35)
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10 
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D L3G ATGGGTCGTGT 


(SEQ ID. NO:36) 
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DL3TGGCGACGATTG 


(SEQ ID. NO:37)
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12 
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D L3 GCG ACG ATTGGG 


(SEQ ID. NO:38) 
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DL3TGGGGGGGA 
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14 
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DL3GAGGGGGOG 
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15 
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DL3GGAGGGGGCGA 


(SEQ ID. NO:39) 
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16 


0 


DL3GAGGGGGCGA 


(SEQ ID. NO:40) 


9 


3 


0 


1 


D L3 GGCTTGGTITjG 


(SEQ ID. NO:41) 


10 
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DL3GGTTGGTTTGGG 


(SEQ ID. NO:42) 


11 
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DUTGGGGTTTCTAG 


(SEQ ID. NO:43) 


12 
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DL3GTTTCTAGTGGG 


(SEQ ID. NO:44) 


13 
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DL3AGTGGGGGGTGT 


(SEQ ID. NO:45) 


14 
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D L3 GGGGTGTC AAAT 


(SEQ ID. NO:46)
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D L3 GTC AAATACATCG 


(SEQ ID. NO:47) 
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DL3ACATCGAATGGAG 


(SEQ ID. NO:48)
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D L3 CG AATGG AGG AG 


(SEQ ID. NO:49) 
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DL3GAGGAGTTTCGT 


(SEQ ID. NO:50) 
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10 
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D L3TTTCGTTATGTGA 


(SEQ ID. NO:51) 
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D L3 ATGTG ACTTTTAC 


(SEQ ID. NO:52) 
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DL3GACTTTTACAAAT 


(SEQ ID. NO:53) 
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13 
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DL3AAATCTGCCCGA 


(SEQ ID. NO:54) 
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14 
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DL3AATCTGCCCGAG 


(SEQ ID. NO:55) 
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DL3CCCGAGTGTAGT 


(SEQ ID. NO:56)
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D L3 AGTGTAGTGGGG 


(SEQ ID. NO:57) 
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DL3GGGAGGGTGAG 


(SEQ ID. NO:58) 
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1 
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D L3 GGTG AGGGTATG 


(SEQ ID. NO:59) 
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D L3 GGTATG ATG ATTAG 


(SEQ ID. NO:60) 
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DL3GATTAGAGTAAGT 


(SEQ ID. NO:61) 


13 


4 
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DL3TTAGAGTAAGTTA 


(SEQ ID. NO:62) 


14 


4 



DL3GGTAGGATGGGT 
DL3GGATGGGTCGTG 
DL3GGTCGTGTGTGT 
D L3GTGTGTGTGGCG 
DL3TGTGGCGACGAT 
DL3GACGATTGGGGT 
DL3ATTGGGGTATGG 
DUGTATGGGGCTTG 
DL3GGATTGTGGTCG 
DL3TGGTCGGATTGG 
DL3GGATTGGTCTAAA 
DL3TCTAAAGTTTAAA 
D L3GTTTAAAATAG AA 
DL3ATAGAAAAACCG 
DL3AGAAAAACCGC 
* DL3AACCGCCATAC 
DL3CCATACGTGAAAA 
DL3ACGTGAAAATTGT 
DL3AATTGTCAGTGGG 
D L3TGTCAGTGGGGG 
DL3TGGGGTTGA 
DL3GGGTTGA7TGTGT 
DL3TTGTGTAATAAAA 
DL3AATAAAAGGGGA 
DL3TAAAAGGGGAGG 
DL3GTTTTTTAAAGG 
DL3TTTTAAAGGTGG 
DL3AGGTGGTTTGG 
DIJTTGGGGGGGAG 
DL3GGAGGGGGCG 
DL3GGGGCGAAGAC 
DL3GAAGACCGGATG 
DL3CCGGATGTCGTG 
DL3GTCGTGAATTTGT 
DL3CGTGAATTTGTGT 
DL3TTGTGTAGAGACG 
DL3TAGAGACGGTTT 
DL3ACGGTTTGGGG 
DL3TGGGGTTTTTGT 
DL3GGGTTTTTGTTT 



(SEQ ID. NO:67) 
(SEQ ID. NO:68) 
(SEQ ID. NO:69) 
(SEQ ID. NO:70) 
(SEQ ID. NO:71) 
(SEQ ID. NO:72) 
(SEQ ID. NO:73) 
(SEQ ID. NO:74) 
(SEQ ID. NO:75) 
(SEQ ID. NO:76) 
(SEQ ID. NO:77) 
(SEQ ID. NO:78)
(SEQ ID. NO:79) 
(SEQ ID. NO:80) 
(SEQ ID. NO:81) 
(SEQ ID. NO:82) 
(SEQ ID. NO:83) 
(SEQ ID. NO:84) 
(SEQ ID. NO:85) 
(SEQ ID. NO:86)
(SEQ ID. NO:87) 
(SEQ ID. NO:88) 
(SEQ ID. NO:89) 
(SEQ ID. NO:90) 
(SEQ ID. NO:91) 
(SEQ ID. NO:92) 
(SEQ ID. NO:93) 
(SEQ ID. NO:94)
(SEQ ID. NO:95)
(SEQ ID. NO:96)
(SEQ ID. NO:97)
(SEQ ID. NO:98) 
(SEQ ID. NO:99) 
(SEQ ID. NO:100) 
(SEQ ID. NO:101) 
(SEQ ID. NO:102) 
(SEQ ID. NO:103) 
(SEQ ID. NO:104) 
(SEQ ID. NO:105) 
(SEQ ID. NO:106) 
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D L3 AAGTTATGTTGGG 
DL3GTTGGGGGCG 
DL3GGGGGGGGTA 
DL3GCGGGTAGGAT 
DL3ACACAATTAATTAA 
D L3 AATTAAITACG AA 
DL3TACGAACATCCTG 
DL3ACGAACATCCTGT 
DL3TCCTGTATTATTA 
D L3 GTATTATTATTGTT 
D L3 ATTGTTAAACTTA 
DL3AAACTTACAGACG 
DUACAGACGTGTCG 
D L3 GTGTCGGTG AAA 
DL3GTGAAAGGTGTGT 



10 11 

11 11 

12 a 

13 11 

14 a 

15 a 

16 a 



DUTGTGTCTGTAGTA 

DUGTAGTATTGTTTT 

D L3 AGTATTGTnTTT 

DL3CCTCGTGGGATA 

DL3TGGGATACAGCG 

DL3GATACAGCGTCAT 

DL3GCGTCATAGACAG 

DL3AGACAGAAACTAA 

DL3CAGAAACTAAGGA 

D L3TAAGG ACGG AGT 

DL3GACGGAGTAGGA 

DL3GTAGGATAATAAA 

DL3TAATAAATAGCG 

DL3ATAGCGTAGGAT 

DL3TAGCGTAGGATG 

DL3AGGATGCAAGTT 

DL3ATGCAAGTTATAA 

D L3 GTTATAATGTCCG 

DL3ATCTCCGCITGT 

D UTCCGCTTGTATG 

DL3GTGAGTGCCCTC 

DIJTGCCCTCGAGAG 

DL3 CCTCGAGAGGTA 

DUAGAGGTACGTAA 

DL3AOGTAAACCATA 

DL3ACCATAAAAGCAG 

DL3AAAGCAGACCC 

DL3AGACCCCCCAT 

DL3CCCCCATACGT 

D L3 CATACGTGCGCT 

DL3GTGCGCTATCAG 

DL3GCGCTATCAGTA 

DUTCAGTAACGCTC 

DUGTAACGCTCTGC 

DL3AGTCTATCCXCA 

DL3ATCCCCAGGGA 

DL3 CAGGG AACTGGT 

DUACTGGTGGTAGG 

DL3CTGGTGGTAGGA 

DL3GTAGGAGGCACA 

DL3GGCACATTTAGT 

D L3TTTAGTTATAGGG 

DL3AGGTTTACGGTG 

DL3TACGGTGGGGA 

DL3GTGGGGAGTGG 

D L3 GGG AGTGGGTG A 

D L3 GGGTG ATCCTATG 

D L3 CXTTATGGTTXjTTT 

DL3GGTTCTTTGGATG 

DL3GTTTGGATGGGT 

DL3ATGGGTGGGAAT 

DL3GGGAATTGTCATG 

DL3GTCATGTATCATGT 

DL3TCATGTATTTCGG 

DL3TATTTCGGTAAA 

D L3TTCGGTAAATGG 

D L3 GTAAATGGCATGT 

D L3 GCATGTAATCGTG 

DL3GTAATCGTGTAAT 

DL3GGGAGGGGTAC 

D L3GGGTACG AATGT 

DL3ACGAATGTTCGTT 

DL3TGTTCGTTCATGT 

D L3 CGTTCATGTCGTT 



(SEQ ID 
(SEQ ID 
(SEQ ID 
(SEQ ID, 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID, 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID, 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 
(SEQ ID. 



NO:63) 
NO:64) 
NO:65) 
NO:66) 
NO:lll) 
NO:112) 
NO:113) 
NO:114) 
NO:115) 
NO:ll6) 
NO:117) 
NO:118) 
NO:119) 
NO:120) 
NO:121) 
NO:122) 
NO:123) 
NO: 124) 
NO:125) 
NO: 126) 
N0:127) 
NO: 128) 
NO:129) 
NO:130) 
NO:131) 
NO:132) 
NO:133) 
NO:134) 
N0:135) 
NO:136) 
NO:137) 
NO:138) 
NO:139) 
NO:140) 
NO:141) 
NO:142) 
NO:143) 
NO:144) 
NO:145) 
NO:146) 
NO:147) 
NO:148) 
NO:149) 
NO:150) 
NO:l51) 
N0:152) 
NO:153) 
NO:154) 
NO:155) 
NO:156) 
NO:203) 
NO:204) 
NO:205) 
NO:206) 
NO:207) 
NO:208) 
NO:209) 
NO:210) 
NO:211) 
NO:212) 
NO:213) 
NO:214) 
NO:215) 
NO:216) 
NO:217) 
NO:218) 
NO:219) 
NO:220) 
N0221) 
NO:222) 
N0223) 
NO:224) 
NO:225) 
NO:226) 
NO:227) 
NO:228) 
NO:229) 
NO:230) 
NO:231) 
NO:232) 



4 
4 
5 
5 
7 
7 
7 
8 
8 
8 
8 
8 
8 
8 
8 

8 8 

9 8 

10 8 

11 8 

12 8 

13 8 

14 8 

15 8 

16 8 
0 9 



15 
16 
0 
1 

14 

15 

16 

0 

1 

2 

3 

4 

5 

6 

7 



13 9 

14 9 



15 

16 

0 

1 

2 

3 

4 

5 

6 

7 

8 

11 



12 13 

13 13 

14 13 

15 13 

16 13 
5 14 
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10 15 

a 15 

12 15 

13 15 

14 15 

15 15 

16 15 



DUTTGTTTCTTGGG 
DIJTCTTGGGATTGTG 
DL3TGTATGAATGATTT 
D L3TG ATTTC ACAC AA 
DL3CTCTGOGACCTC 
DOGACCTCGGCCT 
DL3TCGGCCTCGTG 
D L3G ATG AAGTCCCAG 
DL3AGTCCCAGTATTT 
DL3GTATTTCGGATTT 
DL3TCGGATTTATCG 
DL3GATTTATCGGGT 
DL3ATCGOGTGTGCA 
DL3TGTGCAAGGGGA 
DL3CAAGGGGAATTT 
D L3G AATTTATTCTGTA 
DL3TCTGTAGTGCTAC 
D L3GTAGTGCTACCT 
DL3GCTACCTAGTAG 
DL3CTAGTAGTOCAGA 
DL3TCCAGATA9TGGG 
DL3AG ATAGTGG G ATA 
DL3GGGATAATTGGT 
DL3TAATTGGTGAGTG 
DL3TATAGGGCGTGT 
DL3GGGCGTGTTCTCA 
DL3GTGTTCTCACGAT 
DL3TCACGATGAGAGG 
DL3ATGAGAGGAGCG 
DL3AGGAGCGAGGC 
DL3CGAGGCCCGG 
DL3GCCCGGGTATT 
D L3CGGGTATTGTGA 
DL3GTGAACCCCCAT 
DL3CCCCATCGATTT 
DIJATCGATTTCACrr 
DL3TTTCACTTGACAT 
DL3TTGACATAGAGCT 
DL3TAGAGCTGTAGAC 
DL3GTAGACCAAGGA 
DL3ACCAAGGATGAAG 
D L3CGTGTAATGTC AG 
D L3TGTCAGTTTAGGG 
DL3TCAGTTTAGGGA 
DL3TAGGGAAGAGCA 
DL3AAGAGCAGGGGT 
DL3CAGGGGTACCTA 
DL3GGTACCTACTGG 
DUTACTGGGGGGA 
D L3GGGGG AGTCTAT 
DL3CATGTA1 1 1 IlGG 
D L3TTTTGGGTTAGG 
DL3GGGTTAGGATCT 
DOGGATGTAGTTTTG 
DL3TGTAGTTTTGGG 
DIJTTTGGGGGAGG 
DL3GGGTTCATAACTG 
DL3ATAACTGAGTGGG 
D L3 AACTG AGTGGGT 
DL3GTGGGTAGTTGT 
DL3GTAGTTGTTGGC 
DL3GTTGGCGATACA 
D L3CG ATACATAAAAG 
DL3TAAAAGCATGTAA 
DL3GCATGTAATGACG 
DL3ATGACGGTCGGT 
DL3GTCGGTGGTACT 
DL3GGTACTTATAACA 
D L3TCG ATTCTAAG AT 
D L3TAAG ATTAAATTT 
DUAAATTTGAATAAG 
DL3AATAAGAGACAAG 
DL3AAGAGACAAGAAA 
D L3 AAG AAAGTACCC 
DL3AAAGTACCCCTT 
DUCCCCTTCGTCTA 
DL3CTTCGTCTAAAC 
D L3CTAAA CCC ATGG 
D L3 AACCC ATGGTGG 
DUTGGTGGGTTCAT 



(SEQ ID. NO: 107) 
(SEQ ID. NO:108) 
(SEQ ID. NO: 109) 
(SEQ ID. NO:110) 
(SEQ ID. NO:157) 
(SEQ ID. NO:158)
(SEQ ID. NO:159) 
(SEQ ID. NO:160) 
(SEQ ID. NO:161) 
(SEQ ID. NO:162) 
(SEQ ID. NO:163) 
(SEQ ID. NO:164) 
(SEQ ID. NO:165) 
(SEQ ID. NO:166) 
(SEQ ID. NO:167) 
(SEQ ID. NO:168)
(SEQ ID. NO:169) 
(SEQ ID. NO:170) 
(SEQ ID. NO:171) 
(SEQ ID. NO:172) 
(SEQ ID. NO:173) 
(SEQ ID. NO:174) 
(SEQ ID. NO:175) 
(SEQ ID. NO:176) 
(SEQ ID. NO:177) 
(SEQ ID. NO:178) 
(SEQ ID. NO:179) 
(SEQ ID. NO:180) 
(SEQ ID. NO:18l) 
(SEQ ID. NO:182) 
(SEQ ED. NO:183) 
(SEQ ID. NO:184) 
(SEQ ID. NO:185) 
(SEQ ID. NO:186) 
(SEQ ID. NO:187) 
(SEQ ID. NO:188) 
(SEQ ID. NO:189) 
(SEQ ID. NO:190) 
(SEQ ID. NO:191) 
(SEQ CD. NO:192) 
(SEQ LD. NO:193) 
(SEQ CD. NO:194) 
(SEQ CD. NO:195) 
(SEQ ED. NO:196) 
(SEQ ED. NO:197) 
(SEQ ED. NO:198) 
(SEQ CD. NO:199) 
(SEQ ED. NO:200) 
(SEQ CD. NO:201) 
(SEQ ED. NO:202) 
(SEQ ED. NO:246) 
(SEQ CD. NO:247) 
(SEQ ED. NO:248) 
(SEQ ED. NO:249) 
(SEQ ED. NO:250) 
(SEQ ED. NO:25l) 
(SEQ ED. NO:252) 
(SEQ ED. NO:253) 
(SEQ CD. NO:254) 
(SEQ ED. NO:255) 
(SEQ CD. NO:256) 
(SEQ ED. NO:257) 
(SEQ ED. NO:258) 
(SEQ ED. NO:259) 
(SEQ CD. NO:260) 
(SEQ ID. NO:261) 
(SEQ ED. NO:262) 
(SEQ ED. NO:263) 
(SEQ CD. NO:264) 
(SEQ ED. NO:265) 
(SEQ ED. NO:266) 
(SEQ ED. NO:267) 
(SEQ ED. NO:268) 
(SEQ CD. NO:269) 
(SEQ ED. NO:270) 
(SEQ ED. NO:271) 
(SEQ ID. NO:272) 
(SEQ ID. NO:273) 
(SEQ ED. NO:274) 
(SEQ ED. NO:275) 



10 


12 


D L3 GTCGTTAGTTGG 


(SEQ ID. NO:233) 


5 


16 


DL3TTGGAAAAAGGT 


(SEQ ID. NO:276) 


Jl 


12 


DL3TAGTTGGOAGTT 


(SEQ ID. NO:234) 


6 


16 


DUAAAAGGTTCCTG 


(SEQ ID. NO:277) 


12 


12 


D L3 GG AGTTG ATAGTG 


(SEQ ID. NO:235)


7 


16 


DL3GGTTCCTGTTTA 


(SEQ ID. NO:278) 


13 


12 


D L3 ATAGTGTGTAGTT 


(SEQ ID. NO:236) 


8 


16 


D L3CCTGTTTAGTCTC 


(SEQ ID. NO:279) 


14 


12 


D L3 GTFTAGTTG ACGT 


(SEQ ID. NO:237) 


9 


16 


D1JTTAGTCTCTTTTT 


(SEQ ID. NO:280) 


15 


12 


D L3TG ACGTTG AGGT 


(SEQ ID. NO:238) 


10 


16 


DL3CniriCAGAAAT 


(SEQ ID. NO:281)


16 


12 


D L3 CGTTG AGGTTTA 


(SEQ ID. NO:239) 


11 


16 


DL3AGAAATTGAGGTG 


(SEQ ID. NO:282) 


5 


13 


DL3TATAACATGCCAT 


(SEQ ID. NO:240) 


12 


16 


D L3 AAATTG AGGTGGT 


(SEQ ID. NO:283) 


6 


13 


D L3 AACATGCC ATGGT 


(SEQ ID. NO:241) 


13 


16 


DL3GGTGGTAATCGT 


(SEQ ID. NO:284) 


7 


13 


DL3CCATGGTATTAT 


(SEQ ID. NO:242) 


14 


16 


DL3TAATCGTGGGTT 


(SEQ ID. NO:285)


8 


13 


D L3 AJTTATG AACTGG 


(SEQ ID. NO:243) 


15 


16 


D L3GTGGGTTTCG AT 


(SEQ ID. NO:286) 


9 


13 


DL3AACTGGTGGACAT 


(SEQ ID. NO:244) 


16 


16 


D L3GGTTTCG ATTCT 


(SEQ ID. NO:287) 


10 


13 


DL3TGGACATCATGTA 


(SEQ ID. NO:245) 









No probes were present in positions X, Y=0, 12 to X, Y=4, 12; X, Y=0, 13 to X, Y=4, 13; X, Y=0, 14 to X, Y=4, 14; X, Y=0, 15 to X, Y=4, 15; X, Y=0, 16 to X, Y=4, 16. The length of each of the probes on the chip was variable to minimize differences in melting temperature and potential for cross-hybridization. Each position in the sequence is represented by at least one probe and most positions are represented by 2 or more probes. As noted above, the amount of overlap between the oligonucleotides varies from probe to probe. FIG. 9 shows the human mitochondrial genome; "OH" is the H strand origin of replication, and arrows indicate the cloned unshaded sequence.

DNA was prepared from hair roots of six human donors (mt1 to mt6) and then amplified by PCR and cloned into M13; the resulting clones were sequenced using chain terminators to verify that the desired specific sequences were present. DNA from the sequenced M13 clones was amplified by PCR, transcribed in vitro, and labeled with fluorescein-UTP using T3 RNA polymerase. The 1.3 kb RNA transcripts were fragmented and hybridized to the chip. The results showed that each different individual had DNA that produced a unique hybridization fingerprint on the chip and that the differences in the observed patterns could be correlated with differences in the cloned genomic DNA sequence. The results also demonstrated that very long sequences of a target nucleic acid can be represented comprehensively as a specific set of overlapping oligonucleotides and that arrays of such probe sets can be usefully applied to genetic analysis.

The sample nucleic acid was hybridized to the chip in a solution composed of 6xSSPE, 0.1% Triton X-100 for 60 minutes at 15° C. The chip was then scanned by confocal scanning fluorescence microscopy. The individual features on the chip were 588x588 microns, but the lower left 5x5 square features in the array did not contain probes. To quantitate the data, pixel counts were measured within each synthesis site. Pixels represent 50x50 microns. The fluorescence intensity for each feature was scaled to a mean determined from 27 bright features. After scanning, the chip was stripped and rehybridized; all six samples were hybridized to the same chip. FIG. 10 shows the image observed from the mt4 sample on the DNA chip. FIG. 11 shows the image observed from the mt5 sample on the DNA chip. FIG. 12 shows the predicted difference image between the mt4 and mt5 samples on the DNA chip based on mismatches between the two samples and the reference sequence (see Anderson et al., 1981, Nature 290: 457-465, incorporated herein by reference). FIG. 13 shows the actual difference image observed.

The results show that, in almost all cases, mismatched probe/target hybrids resulted in lower fluorescence intensity than perfectly matched hybrids. Nonetheless, some probes detected mutations (or specific sequences) better than others, and in several cases, the differences were within noise levels. Improvements can be realized by increasing the amount of overlap between probes and hence overall probe density and, for duplex DNA targets, using a second set of probes, either on the same or a separate chip, corresponding to the second strand of the target. FIG. 14, in sheets 1 and 2, shows a plot of normalized intensities across rows 10 and 11 of the array and a tabulation of the mutations detected.

FIG. 15 shows the discrimination between wild-type and mutant hybrids obtained with this chip. The median of the six normalized hybridization scores for each probe was taken. The graph plots the ratio of the median score to the normalized hybridization score versus mean counts. On this graph, a ratio of 1.6 and mean counts above 50 yield no false positives, and while it is clear that detection of some mutants can be improved, excellent discrimination is achieved, considering the small size of the array. FIG. 16 illustrates how the identity of the base mismatch may influence the ability to discriminate mutant and wild-type sequences more than the position of the mismatch within an oligonucleotide probe. The mismatch position is expressed as % of probe length from the 3'-end. The base change is indicated on the graph. These results show that the DNA chip increases the capacity of the standard reverse dot blot format by orders of magnitude, extending the power of that approach many fold and that the methods of the invention are more efficient and easier to automate than gel-based methods of nucleic acid sequence and mutation analysis.

These advantages become more apparent as chips with more and more probes are employed. To illustrate, the present invention provides a DNA chip for analyzing human mitochondrial DNA (mtDNA) that "tiles" through 648 nucleotides of human H strand mtDNA from positions 16280 to 356. The probes in the array are 15 nucleotides in length, and each position in the target sequence is represented by a set of 4 probes (A, C, G, T substitutions), which differed from one another at position 7 from the 3'-end. The array consists of 13 blocks of 4x50 probes: each block scans through 50 nucleotides of contiguous mtDNA sequence. The blocks are separated by blank rows. The 4 corner columns contain control probes; there are a total of 2600 probes in a [...] cm square area (feature), and each area is [...].

Labeled RNA target DNA was prepared by PCR amplification of a 1.3 kb region of mtDNA (positions 15935 to 667), cloning into M13 (sequence verification was performed), and reamplification of the cloned sequences using primers tagged with T3 and T7 RNA polymerase promoter sequences and in vitro transcription to produce fluorescein-UTP labeled RNA. The RNA was fragmented and hybridized to the oligonucleotide array in a solution composed of 6xSSPE, 0.1% Triton X-100 for 60 minutes at 18° C. Unhybridized material was washed away with buffer, and the chip was scanned at 25 micron pixel resolution.
FIG. 17 provides a 5' to 3' sequence listing of one target
corresponding to the probes on the chip. X is a control probe.
Positions that differ in the target (i.e., are mismatched with the
probe at the designated site) are in bold. FIG. 18 shows the
fluorescence image produced by scanning the chip when hybridized to
this sample. About 95% of the sequence could be read correctly from
only one strand of the original duplex target nucleic acid. Although
some probes did not provide excellent discrimination and some probes
did not appear to hybridize to the target efficiently, excellent
results were achieved. The target sequence differed from the probe
set at six positions: 4 transitions and 2 insertions. All 4
transitions were detected, and specific probes could readily be
incorporated into the array to detect insertions or deletions.
FIG. 19 illustrates the detection of 4 transitions in the target
sequence relative to the wild-type probes on the chip.

These results illustrate that longer sequences can be read using the
DNA chips and methods of the invention, as compared to conventional
sequencing methods, where reading length is limited by the
resolution of gel electrophoresis. Similar results were observed
when genomic DNA samples were prepared from human hair roots.
Hybridization and signal detection require less than an hour and can
be readily shortened by appropriate choice of buffers, temperatures,
probes, and reagents. In principle, longer sequence reads can be
obtained than by conventional sequencing, where reading length is
limited by the resolution of gel electrophoresis.

P53 Sequencing and Diagnostic DNA Chips

P53 is a tumor suppressor gene that has been found to be mutated in
most forms of cancer (see Levine et al., 1991, Nature 351: 453-456,
and Hollstein et al., 1991, Science 253: 49-53, each of which is
incorporated herein by reference). In addition, there is a
hereditary syndrome, Li-Fraumeni, in which individuals inherit
mutant alleles of p53 and tend to have cancer at relatively young
ages (Frebourg et al., 1992, PNAS 89: 6413-6417, incorporated herein
by reference). During the development of a cancer, p53 is
inactivated. The course of p53 inactivation generally involves a
mutation in one copy of p53 and is often followed by deletion of the
other copy. After p53 is inactivated, chromosomal abnormalities
begin to appear in tumors. In the best understood form of cancer,
colorectal cancer, well over 50%, perhaps 80%, of all patients with
tumors have p53 mutations. In addition, p53 mutations have been
found in a high proportion of lung, breast, and other tumors
(Rodrigues et al., 1990, PNAS 87: 7555-7559, incorporated herein by
reference). According to data presented by David Sidransky (1992 San
Diego Conference), over 400 mutations in p53 are known.

The p53 gene spans 20 kbp in humans and has 11 exons, 10 of which
are protein coding (see Tominaga et al., 1992, Critical Reviews in
Oncogenesis 3: 257-282, incorporated herein by reference). The gene
produces a 53 kilodalton phosphoprotein that regulates DNA
replication. The protein acts to halt replication at the G1/S
boundary in the cell cycle and is believed to act as a "molecular
policeman," shutting down replication when the DNA is damaged or
blocking the reproduction of DNA viruses (see Lane, 1992, Nature
358: 15-16, incorporated herein by reference). There is substantial
interest in the cancer research community in analyzing p53
mutations. The NCI is currently funding contracts to characterize
the p53 mutation spectra caused by various carcinogens. In addition,
there are research projects which involve sequencing p53 from
spontaneously arising tumors. A major resource in these studies is
the huge supply of biopsy material stored in paraffin blocks. Also,
there are projects which are aimed at analyzing the relationship
between the particular mutation in p53 and the functioning of the
resulting protein. Furthermore, there are projects looking at the
germline inheritance of p53 mutations and the development of cancer.
The present invention provides useful DNA chips and methods for such
studies.

In addition, the present invention also provides a diagnostic test
kit and method and p53 probes immobilized on a DNA chip in an
organized array. Currently available diagnostic tests for cancer
typically have a sensitivity of about 50%. The present invention
provides significant advantages over such tests, and in one
embodiment provides a method for detecting cancer-causing mutations
in p53 that involves the steps of (1) obtaining a biopsy, which is
optionally fractionated by cryostat sectioning to enrich tumor cells
to about 80% of the total cell population. The DNA or RNA is then
extracted, amplified, and analyzed with a DNA chip for the presence
of p53 mutations correlated with malignancy.

To illustrate the value of the DNA chips of the present invention in
such a method, a DNA chip was synthesized by the VLSIPS™ method to
provide an array of overlapping probes which represent or tile
across a 60 base region of exon 6 of the p53 gene. To demonstrate
the ability to detect substitution mutations in the target, twelve
different single substitution mutations (wild type and three
different substitutions at each of three positions) were represented
on the chip along with the wild type. Each of these mutations was
represented by a series of twelve 12-mer oligonucleotide probes,
which were complementary to the wild type target except at the one
substituted base. Each of the twelve probes was complementary to a
different region of the target and contained the mutated base at a
different position (e.g., if the substitution was at base 32, the
set of probes would be complementary, with the exception of base 32,
to regions of the target 21-32, 22-33, and 32-43). This enabled
investigation of the effect of the substitution position within the
probe. The alignment of some of the probes with a 12-mer model
target nucleic acid is shown in FIG. 20.

To demonstrate the effect of probe length, an additional series of
ten 10-mer probes was included for each mutation (see FIG. 21). In
the vicinity of the substituted positions, the wild-type sequence
was represented by every possible overlapping 12-mer and 10-mer
probe. To simplify comparisons, the probes corresponding to each
varied position were arranged on the chip in the rectangular regions
with the following structure: each row of cells represents one
substitution, with the top row representing the wild type. Each
column contains probes complementary to the same region of the
target, with probes complementary to the 3'-end of the target on the
left and probes complementary to the 5'-end of the target on the
right. The difference between two adjacent columns is a single base
shift in the positioning of the probes. Whenever possible, the
series of 10-mer probes were placed in four rows immediately
underneath and aligned with the 4 rows of 12-mer probes for the same
mutation.

To provide model targets, 5' fluoresceinated 12-mers containing all
possible substitutions in the first position of codon 192 were
synthesized (see the starred position in the target in FIG. 20).
Solutions containing 10 nM target DNA in 6xSSPE, 0.25% Triton X-100
were hybridized to the chip at room temperature for several hours.
While target nucleic acid was hybridized to the chip, the
fluorophores on the chip were excited by light from an argon laser,
and the chip was scanned with an autofocusing confocal microscope.
The emitted signals were processed by a PC to produce an image using
image analysis software. By 1 to 3 hours, the signal had reached a
plateau; to remove the hybridized target and
allow hybridization to another target, the chip was stripped
with 60% formamide, 2xSSPE at 17° C. for 5 minutes. The 
washing buffer and temperature can vary, but the buffer 
typically contains 2-to-3xSSPE, 10-to-60% formamide (one 
can use multiple washes, increasing the formamide concen- 
tration by 10% each wash, and scanning between washes to 
determine when the wash is complete), and optionally a 
small percentage of Triton X-100, and the temperature is 
typically in the range of 15° to 18° C. 

Very distinct patterns were observed after hybridization
with targets with one-base substitutions and visualization with
a confocal microscope and software analysis, as shown in
FIG. 22. In general, the probes which form perfect matches
with the target retain the highest signal. For example, in the
first image in FIG. 22, the 12-mer probes that form perfect
matches with the wild-type (WT) target are in the first row
(top). The 12-mer probes with single base mismatches are
located in the second, third, and fourth rows and have much
lower signals. The data are also depicted graphically in FIG.
23. On each graph, the X ordinate is the position of the probe
in its row on the chip, and the Y ordinate is the signal at that
probe site after hybridization.

When a target with a different one-base substitution is
hybridized, the complementary set of probes has the highest
signal (see pictures 2, 3, and 4 in FIG. 22 and graphs 2, 3, 
and 4 in FIG. 23). In each case, the probe set with no 
mismatches with the target has the highest signals. Within a 
12-mer probe set, the signal was highest at position 6 or 7. 
The graphs show that the signal difference between 12-mer 
probes at the same X ordinate tended to be greatest at 
positions 5 and 8 when the target and the complementary 
probes formed 10 base pairs and 11 base pairs, respectively. 
Because tumors often have both WT and mutant p53 genes, 
mixed target populations were also hybridized to the chip, as 
shown in FIG. 24. When the hybridization solution consisted 
of a 1:1 mixture of WT 12-mer and a 12-mer with a 
substitution in position 7 of the target, the sets of probes that 
were perfectly matched to both targets showed higher sig- 
nals than the other probe sets. 
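
The reading rule implied by these observations (the probe set with no mismatches to the target gives the highest signal) can be sketched in a few lines. This is only a minimal illustration, not the image analysis software used in the experiments; the function name `call_base`, the example intensity values, and the choice to report a tie as "no call" are assumptions introduced here.

```python
def call_base(intensities):
    """Return the base whose probe gave the highest signal, or None on a tie.

    `intensities` maps each candidate base (the base represented by one row
    of probes at the interrogated position) to a fluorescence signal.
    The values used below are hypothetical, not measured data.
    """
    ranked = sorted(intensities.items(), key=lambda item: item[1], reverse=True)
    (best_base, best_signal), (_, second_signal) = ranked[0], ranked[1]
    if best_signal == second_signal:
        return None  # tied signals: leave the position uncalled
    return best_base


# Hypothetical signals for the four probes at one column of the array.
example = {"A": 120.0, "C": 950.0, "G": 140.0, "T": 95.0}
called = call_base(example)
```

With a mixed target population, as in FIG. 24, two rows would be expected to stand out, so a single-winner rule of this kind would need to be extended before it could report both alleles.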

The hybridization efficiency of a 10-mer probe array was also
compared with that of a 12-mer probe array. The
10-mer and 12-mer probe arrays gave comparable signals
(see graphs 1-4 in FIG. 23 and graphs 1-4 in FIG. 25).
However, the 10-mer probe sets, which are in rows 5-8 (see
images in FIG. 22), seemed to be better in this model system
than the 12-mer probe sets at resolving one target from
another, consistent with the expectation that one-base
mismatches are more destabilizing for 10-mers than 12-mers.
Hybridization results within probe sets perfectly matched to
target also followed the expectation that the more matches
the individual probe formed with the target, the higher the
signal. However, duplexes with two 3' dangles (see FIG. 23,
position 6 in graphs 1-4) have about as much signal as the
probes which are matched along their entire length (see FIG.
23, position 7, in graphs 1-4).

This illustrative model system shows that 12-mer targets
that differ by one-base substitutions can be readily
distinguished from one another by the novel probe array provided
by the invention and that resolution of the different 12-mer
targets was somewhat better with the 10-mer probe sets than
with the 12-mer probe sets. The results obtained with several
overlapping probes hybridizing to a target demonstrate the
value of the multiple hybridization events that take place on
a DNA chip of the invention. The results also demonstrate
the feasibility of constructing a probe set to sequence the
entire 1.4 kbp protein coding region of p53 or alternatively
the 0.6 kbp of exons 5-9 containing mutation hot spots.
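
The probe layout used in this model system, in which each substitution is represented by a series of overlapping probes that carry the substituted base at a different position, can be sketched as follows. This is a minimal illustration under stated assumptions: the 60-base target string is made up (it is not the p53 exon 6 sequence), the function names are hypothetical, and probes are written as reverse complements of the target window.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def substitution_probes(target, position, new_base, length=12):
    """List (start, end, probe) for every window of `length` bases covering
    `position` (1-based). Each probe is complementary to the target carrying
    `new_base` at `position`, i.e. to the wild type except at that one base.
    """
    mutated = target[:position - 1] + new_base + target[position:]
    probes = []
    for start in range(position - length + 1, position + 1):
        end = start + length - 1
        if start < 1 or end > len(target):
            continue  # window would run off the end of the target
        probes.append((start, end, reverse_complement(mutated[start - 1:end])))
    return probes


target = "ACGT" * 15                            # hypothetical 60-base target
probes = substitution_probes(target, 32, "A")   # base 32 is T in this string
```

For a substitution at base 32 this yields twelve 12-mers covering regions 21-32 through 32-43, matching the example given in the text; passing `length=10` gives the corresponding series of ten 10-mers.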
For sequencing, the p53 DNA can be cloned from the
sample or directly amplified from genomic DNA by PCR. If 
genomic PCR is used, then the DNA can be diluted prior to 
amplification so that a single copy of the gene is amplified. 
For diagnostic purposes, the genomic DNA can be isolated 
from a tumor biopsy in which the tumor cells may be the 
majority population. As noted above, the proportion of 
tumor cells in a sample can be enriched by cryostat section- 
ing. DNA can also be isolated and amplified from tumor 
samples stored in paraffin blocks. 

The p53 DNA in the sample can be amplified by PCR
(although other amplification methods can be used) using
3-4 primer pairs generating amplicons of <3 kbp each.
Illustrative primers of the invention for amplifying exon 5 of
the p53 gene are shown below (B is biotin; F is fluorescein).

5'-B-CACTTGTGCCCTGACTTTCAAC-3' (SEQ. ID NO:288)
5'-F-CACTTGTGCCCTGACTTTCAAC-3'
5'-ATGCAATTAACCCTCACTAAAGGGAGACACTTGTGCCCTGACTTTCAAC-3'
(SEQ. ID NO:289) (has T3 promoter)

5'-B-GACCCTGGGCAACCAGCCCTGTCGT-3' (SEQ. ID NO:290)
5'-F-GACCCTGGGCAACCAGCCCTGTCGT-3'
5'-TAATACGACTCACTATAGGGAGGACCCTGGGCAACCAGCCCTGTCGT-3'
(SEQ. ID NO:291) (has T7 promoter)

After PCR amplification of the target (the amplified target is
called the "amplicon"), one strand of the amplicon can then
be isolated, i.e., using a biotinylated primer that allows
capture of the undesired strand on streptavidin beads.
Alternatively, asymmetric PCR can be used to generate a
single-stranded target. Another approach involves the
generation of single-stranded RNA from the PCR product by
incorporating a T7 or other RNA polymerase promoter in
one of the primers. The single-stranded material can
optionally be fragmented to generate smaller nucleic acids with
less significant secondary structure than longer nucleic
acids.

In one such method, fragmentation is combined with
labeling. To illustrate, degenerate 8-mers or other degenerate
short oligonucleotides are hybridized to the single-stranded
target material. In the next step, a DNA polymerase is added
with the four different dideoxynucleotides, each labeled with
a different fluorophore. Fluorophore-labeled dideoxynucleotides
are available from a variety of commercial suppliers,
such as ABI. Hybridized 8-mers are extended by a labeled
dideoxynucleotide. After an optional purification step, i.e.,
with a size exclusion column, the labeled 9-mers are hybridized
to the chip. Other methods of target fragmentation can
be employed. The single-stranded DNA can be fragmented 
by partial degradation with a DNAse or partial depurination 
with acid. Labeling can be accomplished in a separate step, 
i.e., fluorophore-labeled nucleotides are incorporated before 
the fragmentation step or a DNA binding fluorophore, such 
as ethidium homodimer, is attached to the target after 
fragmentation. 

In one embodiment, the DNA chip has an array of 10^4 to
10^5 probes tiling across the protein coding regions of p53,
which comprise about 1200 bp; smaller arrays specific for 
the 600 bp mutational hot spot region are also useful. The 
probes overlap for N-2 to N-4 bases, where N is the length 
of the probe in bases. N is typically 10 to 14 bases long, but 
as will be seen below, probes 15 to 19 bases and longer are 
also useful. Every possible single base substitution occur- 
ring one at a time is represented in the array. The number of 
unique 10-mer probes with 7 base overlaps would be about 
(1200/3)x4x10 or about 1.6x10^4. To allow 3 replicates of each
probe, one might have a total array size on the order of 4.8x10^4
probes. Of course, arrays of probes within the ranges of 10^2 to
10^6 probes are also useful for applications; for example, very
large arrays of 10^6 or more probes are useful for sequencing or
sequence checking large genomic DNA fragments. Optionally fragmented
and labeled target nucleic acid hybridized to the chip is detected
by a confocal microscope or other imaging device. The pattern of
sites "lighting up" with target is preferably analyzed with computer
assistance to provide the sequence of the target from the pattern of
sites producing signals.

The invention is illustrated below with examples of DNA chips
comprising very large arrays of DNA probes to "resequence" p53
target nucleic acid in a sample. To analyze DNA from exon 5 of the
p53 tumor suppressor gene, a set of overlapping 17-mer probes was
synthesized on a chip. The probes for the WT allele were synthesized
so as to tile across the entire exon with single base overlaps
between probes. For each WT probe, a set of 4 additional probes, one
for each possible base substitution at position 7, was synthesized
and placed in a column relative to the WT probe. Exon 5 DNA was
amplified by PCR with primers flanking the exon. One of the primers
was labeled with fluorescein; the other primer was labeled with
biotin. After amplification, the biotinylated strand was removed by
binding to streptavidin beads. The fluoresceinated strand was used
in hybridization.

About 1/3 of the amplified, single-stranded nucleic acid was
hybridized overnight in 5xSSPE at 60° C. to the probe chip (under a
cover slip). After washing with 6xSSPE, the chip was scanned using
confocal microscopy. FIG. 26 shows an image of the p53 chip
hybridized to the target DNA. Analysis of the intensity data showed
that 93.5% of the 184 bases of exon 5 were called in agreement with
the WT sequence (see Buchman et al., 1988, Gene 70: 245-252,
incorporated herein by reference). The miscalled bases were from
positions where probe signal intensities were tied (1.6%) and where
non-WT probes had the highest signal intensity (4.9%). FIG. 27
illustrates how the actual sequence was read. Gaps in the sequence
of letters in the WT rows correspond to control probes or sites.
Positions at which bases are miscalled are represented by letters in
italic type in cells corresponding to probes in which the WT bases
have been substituted by other bases.

As the diagram indicates, the miscalled bases are from the low
intensity areas of the image, which may be due to secondary
structure in the target or probes preventing intermolecular
hybridization. To diminish the effects due to secondary structure,
one can employ shorter targets (i.e., by target fragmentation) or
use more stringent hybridization conditions. In addition, the use of
a set of probes synthesized by tiling across the other strand of a
duplex target can also provide sequence information buried in
secondary structure in the other strand. It should be appreciated,
however, that the pattern of low intensity areas that forms as a
result of secondary structure in the target itself provides a means
to identify that a specific target sequence is present in a sample.
Other factors that may contribute to lower signal intensities
include differences in probe densities and hybridization
stabilities.

These results demonstrate the advantages provided by the DNA chips
of the invention to genetic analysis. As another example,
heterozygous mutations are currently sequenced by an arduous process
involving cloning and repurification of DNA. The cloning step is
required, because the gel sequencing systems are poor at resolving
even a 1:1 mixture of DNA. First, the target DNA is amplified by PCR
with primers allowing easy ligation into a vector, which is taken up
by transformation of E. coli which in turn must be cultured,
typically on plates overnight. After growth of the bacteria, DNA is
purified in a procedure that typically takes about 2 hours; then,
the sequencing reactions are performed, which takes at least another
hour, and the samples are run on the gel for several hours, the
duration depending on the length of the fragment to be sequenced. By
contrast, the present invention provides direct analysis of the PCR
amplified material after brief transcription and fragmentation
steps, saving days of time and labor.

An interesting clinical application for the characterization of
heterozygous mutations with DNA chips is as follows. Individuals
with germline cancer mutations have a very high [illegible] tumors
[illegible] treatment by irradiation. About 10% of all cancer
patients may have germline mutations for p53 or other tumor
suppressor genes. Thus, before deciding on a treatment modality, a
[illegible] could use the method and DNA chips of the invention to
test for a germline suppressor gene mutation.

Chips for Rational Therapeutic Management

The present invention also provides DNA chips that can be used by
physicians to determine optimum therapeutic [illegible] by rapid
detection of biologically mediated resistance to a therapeutic agent
in a variety of disease states. The [illegible] of such DNA chips
are many, as the chips will help physicians recognize health care
cost savings, achieve rapid therapeutic benefits, limit
administration of ineffective (due to the resistance) yet toxic
drugs, monitor changes in pathogen resistance, and decrease pathogen
acquisition of resistance. [illegible] include the treatment of HIV,
other infectious diseases, and cancer.

HIV has infected a large and expanding number of people,
[illegible] in massive health care expenditures. HIV can rapidly
become resistant to [illegible] used to treat the infection,
[illegible] of the heterodimeric [illegible] HIV reverse
transcriptase (RT) encoded by the [illegible] kb pol gene
[illegible] error [illegible] to account for [illegible] of HIV.
[illegible] nucleoside analogues, i.e., AZT, ddI, ddC, and d4T,
commonly used to treat HIV infection are converted to nucleotide
analogues by sequential phosphorylation in the cytoplasm of
[illegible] incorporation of the analogue into the viral DNA results
in termination of viral replication, because the 5'->3'
phosphodiester linkage cannot be completed. However, after 6 months
to 1 year of treatment, HIV mutates the RT gene so as to become
incapable of incorporating the analogue and so [illegible] to
treatment. Several known mutations are shown in tabular form below.

RT MUTATIONS ASSOCIATED WITH DRUG RESISTANCE

codon      aa change            nt change            antiviral
67         Asp -> Asn           GAC -> AAC           AZT
70         Lys -> Arg           AAA -> AGA           AZT
215        Thr -> Phe or Tyr    ACC -> TTC or TAC    AZT
219        Lys -> Gln or Glu    AAA -> CAA or GAA    AZT
[missing]  Met -> Leu           ATG -> TTG or CTG    ddI and ddC
[missing]  Met -> Val           ATG -> GTG           ddI and ddC
74         Leu -> Val           [missing]            TIBO 82150
100        Leu -> Ile           [missing]            [missing]

N.B. other mutations confer resistance to other drugs in vitro

The present invention provides DNA chips for detecting the multiple
mutations in the HIV RT gene associated with
resistance to different therapeutics. These DNA chips will
enable physicians to monitor mutations over time and to 
change therapeutics if resistance develops. The DNA chip 
will provide redundant confirmation of conserved HIV RT 
and other gene sequences, and the probes on the chip will tile 
through, with overlap, in important mutational hot spot 
regions. The chip will optionally have probes that span the 
entire coding region of the RT and optionally the genes for 
other HIV proteins, such as coat proteins. HIV target nucleic 
acid can be isolated from blood samples (peripheral blood 
lymphocytes or PBMC) and amplified by PCR, primers for 
which are shown in the table below. 

AMPLIFICATION OF TARGET

TARGET SIZE   PRIMER 1                        PRIMER 2
1,742 bp      GTAGAATTCTGTTGACTCAGATTGG       GATAAGCTTGGGCCTTATCTATTCCAT
              (SEQ ID. NO:292)                (SEQ ID. NO:294)
535 bp        AAATCCATACAATACTCCAGTATTTGC     ACCCATCCAAAGGAATGGAGGTTCTTTC
              (SEQ ID. NO:293)                (SEQ ID. NO:295)
323 bp        Genbank #K02013 1889-1908       bases 2211-2192

The HIV RT gene chips of the invention, as well as the CF,
mtDNA, and p53 DNA chips of the invention, illustrate the
diverse application of the methods and probe arrays of the
invention. The examples that follow describe methods for
preparing nucleic acid targets from samples for application
to the DNA chips of the invention and provide additional
details of the methods of the invention.

EXAMPLES
I. VLSIPS™ Technology

As noted above, the VLSIPS™ technology is described in
a number of patent publications and is preferred for making
the oligonucleotide arrays of the invention. For
completeness, a brief description of how this technology can
be used to make and screen DNA chips is provided in this
Example and the accompanying Figures. In the VLSIPS™
method, light is shone through a mask to activate functional
(for oligonucleotides, typically an -OH) groups protected
with a photoremovable protecting group on a surface of a
solid support. After light activation, a nucleoside building
block, itself protected with a photoremovable protecting
group (at the 5'-OH), is coupled to the activated areas of
the support. The process can be repeated, using different
masks or mask orientations and building blocks, to prepare
very dense arrays of many different oligonucleotide probes.
The process is illustrated in FIG. 28; FIG. 29 illustrates how
the process can be used to prepare "nucleoside combinatorials"
or oligonucleotides synthesized by coupling all four
nucleosides to form dimers, trimers, etc.

New methods for the combinatorial chemical synthesis of
peptide, polycarbamate, and oligonucleotide arrays have
recently been reported (see Fodor et al., 1991, Science 251:
767-773; Cho et al., 1993, Science 261: 1303-1305; and
Southern et al., 1992, Genomics 13: 1008-1017, each of
which is incorporated herein by reference). These arrays, or
biological chips (see Fodor et al., 1993, Nature 364:
555-556, incorporated herein by reference), harbor specific
chemical compounds at precise locations in a high-density,
information rich format, and are a powerful tool for the
study of biological recognition processes. A particularly
exciting application of the array technology is in the field of
DNA sequence analysis. The hybridization pattern of a DNA
target to an array of shorter oligonucleotide probes is used
to gain primary structure information of the DNA target.
This format has important applications in sequencing by
hybridization, DNA diagnostics and in elucidating the
thermodynamic parameters affecting nucleic acid recognition.

Conventional DNA sequencing technology is a laborious
procedure requiring electrophoretic size separation of
labeled DNA fragments. An alternative approach, termed
Sequencing By Hybridization (SBH), has been proposed
(Lysov et al., 1988, Dokl. Akad. Nauk SSSR 303: 1508-1511;
Bains et al., 1988, J. Theor. Biol. 135: 303-307; and
Drmanac et al., 1989, Genomics 4: 114-128, incorporated
herein by reference). This method uses a set of short
oligonucleotide probes of defined sequence to search for
complementary sequences on a longer target strand of DNA.
The hybridization pattern is used to reconstruct the target
DNA sequence. It is envisioned that hybridization analysis
of large numbers of probes can be used to sequence long
stretches of DNA. In immediate applications of this
hybridization methodology, a small number of probes can be used
to interrogate local DNA sequence.

The strategy of SBH can be illustrated by the following
example. A 12-mer target DNA sequence,
AGCCTAGCTGAA (SEQ. ID NO:296), is mixed with a
complete set of octanucleotide probes. If only perfect
complementarity is considered, five of the 65,536 octamer
probes (TCGGATCG, CGGATCGA, GGATCGAC,
GATCGACT, and ATCGACTT) will hybridize to the target.
Alignment of the overlapping sequences from the hybridizing
probes reconstructs the complement of the original
12-mer target:

TCGGATCG
 CGGATCGA
  GGATCGAC
   GATCGACT
    ATCGACTT
TCGGATCGACTT (SEQ. ID NO:297)
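
The alignment step in this example can be sketched in code. This is a minimal illustration, not a general SBH algorithm: it assumes error-free, perfect-match hybridization and a target in which no 7-base overlap occurs twice, so that the probes chain together in exactly one way. The function names are hypothetical, and probes are written aligned to the target (not reversed), as in the listing above.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq):
    """Base-by-base complement, kept in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq)


def hybridizing_probes(target, k=8):
    """All k-mer probes that are perfectly complementary to the target."""
    comp = complement(target)
    return [comp[i:i + k] for i in range(len(comp) - k + 1)]


def reconstruct(probes):
    """Chain probes that overlap by k-1 bases into one sequence."""
    k = len(probes[0])
    by_prefix = {p[:-1]: p for p in probes}
    suffixes = {p[1:] for p in probes}
    # The first probe is the one whose leading k-1 bases end no other probe.
    sequence = next(p for p in probes if p[:-1] not in suffixes)
    full_length = len(probes) + k - 1
    while len(sequence) < full_length and sequence[-(k - 1):] in by_prefix:
        sequence += by_prefix[sequence[-(k - 1):]][-1]
    return sequence


target = "AGCCTAGCTGAA"
probes = sorted(hybridizing_probes(target))   # order on the array is arbitrary
rebuilt = reconstruct(probes)                 # "TCGGATCGACTT"
recovered_target = complement(rebuilt)        # "AGCCTAGCTGAA"
```

Sorting the probes before reconstruction is only there to show that the result does not depend on the order in which the hybridizing probes are read off the array.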

Hybridization methodology can be carried out by attaching
target DNA to a surface. The target is interrogated with a set
of oligonucleotide probes, one at a time (see Strezoska et al.,
1991, Proc. Natl. Acad. Sci. USA 88: 10089-10093, and
Drmanac et al., 1993, Science 260: 1649-1652, each of
which is incorporated herein by reference). This approach
can be implemented with well established methods of
immobilization and hybridization detection, but involves a large
number of manipulations. For example, to probe a sequence
utilizing a full set of octanucleotides, tens of thousands of
hybridization reactions must be performed. Alternatively,
SBH can be carried out by attaching probes to a surface in
an array format where the identity of the probes at each site
is known. The target DNA is then added to the array of
probes. The hybridization pattern determined in a single
experiment directly reveals the identity of all complementary
probes.
As noted above, a preferred method of oligonucleotide probe array synthesis involves the use of light to direct the synthesis of oligonucleotide probes in high-density, miniaturized arrays. Photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry, and versatile combinatorial synthesis strategies have been developed for this technology. Matrices of spatially-defined oligonucleotide probes have been generated, and the ability to use these arrays to identify complementary sequences has been demonstrated by hybridizing fluorescent labeled oligonucleotides to the DNA chips produced by the methods. The hybridization pattern demonstrates a high degree of base specificity and reveals the sequence of oligonucleotide targets.

The basic strategy for light-directed oligonucleotide synthesis (1) is outlined in FIG. 28. The surface of a solid support modified with photolabile protecting groups (X) is illuminated through a photolithographic mask, yielding reactive hydroxyl groups in the illuminated regions. A 3'-O-phosphoramidite activated deoxynucleoside (protected at the 5'-hydroxyl with a photolabile group) is then presented to the surface and coupling occurs at sites that were exposed to light. Following capping and oxidation, the substrate is rinsed and the surface illuminated through a second mask, to expose additional hydroxyl groups for coupling. A second 5'-protected, 3'-O-phosphoramidite activated deoxynucleoside is presented to the surface. The selective photodeprotection and coupling cycles are repeated until the desired set of products is obtained.

Light directed chemical synthesis lends itself to highly efficient synthesis strategies which will generate a maximum number of compounds in a minimum number of chemical steps. For example, the complete set of 4^n polynucleotides (length n), or any subset of this set, can be produced in only 4×n chemical steps. See FIG. 29. The patterns of illumination and the order of chemical reactants ultimately define the products and their locations. Because photolithography is used, the process can be miniaturized to generate high-density arrays of oligonucleotide probes. For an example of the nomenclature useful for describing such arrays, an array containing all possible octanucleotides of dA and dT is written as (A+T)^8. Expansion of this polynomial reveals the identity of all 256 octanucleotide probes from AAAAAAAA to TTTTTTTT. A DNA array composed of complete sets of dinucleotides is referred to as having a complexity of 2. The array given by (A+T+C+G)^8 is the full 65,536 octanucleotide array of complexity four.

To carry out hybridization of DNA targets to the probe arrays, the arrays are mounted in a thermostatically controlled hybridization chamber. Fluorescein labeled DNA targets are injected into the chamber and hybridization is allowed to proceed for ½ to 2 hours. The surface of the matrix is scanned in an epifluorescence microscope (Zeiss Axioscop 20) equipped with photon counting electronics using 50-100 µW of 488 nm excitation from an Argon ion laser (Spectra Physics model 2020). All measurements are acquired with the target solution in contact with the probe matrix. Photon counts are stored and image files are presented after conversion to an eight bit image format. See FIG. 33.

When hybridizing a DNA target to an oligonucleotide array, N ≈ Lt - (Lp - 1) complementary hybrids are expected, where N is the number of hybrids, Lt is the length of the DNA target, and Lp is the length of the oligonucleotide probes on the array. For example, for an 11-mer hybridized to an octanucleotide array, N = 4. Hybridizations with mismatches at positions that are 2 to 3 residues from either end of the probes will generate detectable signals. Modifying the above expression for N, one arrives at a relationship estimating the number of detectable hybridizations (Nd) for a DNA target of length Lt and an array of complexity C. Assuming an average of 5 positions giving signals above background: Nd ≈ (1 + 5(C - 1))[Lt - (Lp - 1)].

Arrays of oligonucleotides can be efficiently generated by light-directed synthesis and can be used to determine the identity of DNA target sequences. Because combinatorial strategies are used, the number of compounds increases exponentially while the number of chemical coupling cycles increases only linearly. For example, expanding the synthesis to the complete set of 4^8 (65,536) octanucleotides will add only four hours to the synthesis for the 16 additional cycles. Furthermore, combinatorial synthesis strategies can be implemented to generate arrays of any desired composition. For example, because the entire set of dodecamers (4^12) can be produced in 48 photolysis and coupling cycles (b^n compounds requires b×n cycles), any subset of the dodecamers (including any subset of shorter oligonucleotides) can be constructed with the correct lithographic mask design in 48 or fewer chemical coupling steps. In addition, the number of compounds in an array is limited only by the density of synthesis sites and the overall array size. Recent experiments have demonstrated hybridization to probes synthesized in 25 µm sites. At this resolution, the entire set of 65,536 octanucleotides can be placed in an array measuring 0.64 cm square, and the set of 1,048,576 dodecanucleotides requires only a 2.56 cm array.

Genome sequencing projects will ultimately be limited by DNA sequencing technologies. Current sequencing methodologies are highly reliant on complex procedures and require substantial manual effort. Sequencing by hybridization has the potential for transforming many of the manual efforts into more efficient and automated formats. Light-directed synthesis is an efficient means for large scale production of miniaturized arrays for SBH. The oligonucleotide arrays are not limited to primary sequencing applications. Because single base changes cause multiple changes in the hybridization pattern, the oligonucleotide arrays provide a powerful means to check the accuracy of previously elucidated DNA sequence, or to scan for changes within a sequence. In the case of octanucleotides, a single base change in the target DNA results in the loss of eight complements, and generates eight new complements. Matching of hybridization patterns may be useful in resolving sequencing ambiguities from standard gel techniques, or for rapidly detecting DNA mutational events. The potentially very high information content of light-directed oligonucleotide arrays will change genetic diagnostic testing. Sequence comparisons of hundreds to thousands of different genes will be assayed simultaneously instead of the current one, or few at a time format. Custom arrays can also be constructed to contain genetic markers for the rapid identification of a wide variety of pathogenic organisms.

Oligonucleotide arrays can also be applied to study the sequence specificity of RNA or protein-DNA interactions. Experiments can be designed to elucidate specificity rules of non Watson-Crick oligonucleotide structures or to investigate the use of novel synthetic nucleoside analogs for antisense or triple helix applications. Suitably protected RNA monomers may be employed for RNA synthesis. The oligonucleotide arrays should find broad application deducing the thermodynamic and kinetic rules governing formation and stability of oligonucleotide complexes.

Other than the use of photoremovable protecting groups, the nucleoside coupling chemistry is very similar to that used routinely today for oligonucleotide synthesis. FIG. 30 shows the deprotection, coupling, and oxidation steps of a solid phase DNA synthesis method. FIG. 31 shows an illustrative synthesis route for the nucleoside building blocks used in the method. FIG. 32 shows a preferred photoremovable protecting group, MeNPOC, and how to prepare the group in active form. The procedures described below show how to prepare these reagents. The nucleoside building blocks are 5'-MeNPOC-THYMIDINE-3'-OCEP; 5'-MeNPOC-N4-t-BUTYL PHENOXYACETYL-DEOXYCYTIDINE-3'-OCEP; 5'-MeNPOC-N4-t-BUTYL PHENOXYACETYL-DEOXYGUANOSINE-3'-OCEP; and 5'-MeNPOC-N4-t-BUTYL PHENOXYACETYL-DEOXYADENOSINE-3'-OCEP.
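
The counting relationships given earlier for expected hybrids, coupling cycles, and array dimensions can be checked with a short calculation. The following Python sketch is illustrative only. The function names are assumptions of this example, and the array is assumed to be a square grid of equal-sized synthesis sites.

```python
import math


def perfect_hybrids(target_len, probe_len):
    """N = Lt - (Lp - 1): perfectly complementary probes for one target."""
    return max(0, target_len - (probe_len - 1))


def detectable_hybrids(target_len, probe_len, complexity, mismatch_positions=5):
    """Nd = (1 + 5(C - 1)) * N, with 5 signal-giving mismatch positions by default."""
    return (1 + mismatch_positions * (complexity - 1)) * perfect_hybrids(target_len, probe_len)


def coupling_cycles(n_bases, length):
    """b^n compounds require b x n photolysis and coupling cycles."""
    return n_bases * length


def array_edge_cm(n_probes, site_um=25.0):
    """Edge length of a square array holding n_probes sites of the given size."""
    sites_per_edge = math.isqrt(n_probes)
    if sites_per_edge ** 2 < n_probes:
        sites_per_edge += 1          # round up when n_probes is not a perfect square
    return sites_per_edge * site_um / 1e4   # micrometres to centimetres


print(perfect_hybrids(11, 8))                          # 4
print(detectable_hybrids(11, 8, complexity=4))         # 64
print(coupling_cycles(4, 12) - coupling_cycles(4, 8))  # 16
print(array_edge_cm(4 ** 8))                           # 65,536 sites of 25 um
print(array_edge_cm(4 ** 10))                          # 1,048,576 sites of 25 um
```

The last two calls correspond to the 0.64 cm and 2.56 cm array sizes stated in the text; note that 1,048,576 equals 4^10.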

A. Preparation of 4,5-methylenedioxy-2-nitroacetophenone

A solution of 50 g (0.305 mole) 3,4-methylenedioxyacetophenone (Aldrich) in 200 mL glacial acetic acid was added dropwise over 30 minutes to 700 mL of cold (2-4° C.) 70% HNO3 with stirring (NOTE: the reaction will overheat without external cooling from an ice bath, which can be dangerous and lead to side products. At temperatures below 0° C., however, the reaction can be sluggish. A temperature of 3°-5° C. seems to be optimal). The mixture was left stirring for another 60 minutes at 3°-5° C., and then allowed to approach ambient temperature. Analysis by TLC (25% EtOAc in hexane) indicated complete conversion of the starting material within 1-2 hr. When the reaction was complete, the mixture was poured into ~3 liters of crushed ice, and the resulting yellow solid was filtered off, washed with water and then suction-dried. Yield ~53 g (84%), used without further purification.

B. Preparation of 1-(4,5-Methylenedioxy-2-nitrophenyl)ethanol

Sodium borohydride (10 g; 0.27 mol) was added slowly to a cold, stirring suspension of 53 g (0.25 mol) of 4,5-methylenedioxy-2-nitroacetophenone in 400 mL methanol. The temperature was kept below 10° C. by slow addition of the NaBH4 and external cooling with an ice bath. Stirring was continued at ambient temperature for another two hours, at which time TLC (CH2Cl2) indicated complete conversion of the ketone. The mixture was poured into one liter of ice-water and the resulting suspension was neutralized with ammonium chloride and then extracted three times with 400 mL CH2Cl2 or EtOAc (the product can be collected by filtration and washed at this point, but it is somewhat soluble in water and this results in a yield of only ~60%). The combined organic extracts were washed with brine, then dried with MgSO4 and evaporated. The crude product was purified from the main byproduct by dissolving it in a minimum volume of CH2Cl2 or THF (~175 ml) and then precipitating it by slowly adding hexane (1000 ml) while stirring (yield 51 g; 80% overall). It can also be recrystallized (e.g., toluene-hexane), but this reduces the yield.

C. Preparation of 1-(4,5-methylenedioxy-2-nitrophenyl)ethyl chloroformate (MeNPOC-Cl)

[Reaction scheme: 1-(4,5-methylenedioxy-2-nitrophenyl)ethanol treated with COCl2 in toluene/THF]

Phosgene (500 mL of 20% w/v in toluene from Fluka: 965 mmole; 4 eq.) was added slowly to a cold, stirring solution of 50 g (237 mmole; 1 eq.) of 1-(4,5-methylenedioxy-2-nitrophenyl)ethanol in 400 mL dry THF. The solution was stirred overnight at ambient temperature, at which point TLC (20% Et2O/hexane) indicated >95% conversion. The mixture was evaporated (an oil-less pump with downstream aqueous NaOH trap is recommended to remove the excess phosgene) to afford a viscous brown oil. Purification was effected by flash chromatography on a short (9×13 cm) column of silica gel eluted with 20% Et2O/hexane. Typically 55 g (85%) of the solid yellow MeNPOC-Cl is obtained by this procedure. The crude material has also been recrystallized in 2-3 crops from 1:1 ether/hexane. On this scale, ~100 ml is used for the first crop, with a few percent THF added to aid dissolution, and then cooling overnight at -20° C. (this procedure has not been optimized). The product should be stored desiccated at -20° C.

D. Synthesis of 5'-MeNPOC-2'-DEOXYNUCLEOSIDE-3'-(N,N-DIISOPROPYL 2-CYANOETHYL PHOSPHORAMIDITES)

(1) 5'-MeNPOC-Nucleosides

[Structure: 5'-MeNPOC-protected 2'-deoxynucleoside (MeNPOC-O, Base, free OH)]

Base = THYMIDINE (T); N-4-ISOBUTYRYL 2'-DEOXYCYTIDINE (ibu-dC); N-2-PHENOXYACETYL 2'-DEOXYGUANOSINE (PAC-dG); and N-6-PHENOXYACETYL 2'-DEOXYADENOSINE (PAC-dA)

All four of the 5'-MeNPOC nucleosides were prepared from the base-protected 2'-deoxynucleosides by the following procedure. The protected 2'-deoxynucleoside (90 mmole) was dried by co-evaporating twice with 250 mL
anhydrous pyridine. The nucleoside was then dissolved in 300 mL anhydrous pyridine (or 1:1 pyridine/DMF, for the dG nucleoside) under argon and cooled to ~2° C. in an ice bath. A solution of 24.6 g (90 mmole) MeNPOC-Cl in 100 mL dry THF was then added with stirring over 30 minutes. The ice bath was removed, and the solution allowed to stir overnight at room temperature (TLC: 5-10% MeOH in CH2Cl2; two diastereomers). After evaporating the solvents under vacuum, the crude material was taken up in 250 mL ethyl acetate and extracted with saturated aqueous NaHCO3 and brine. The organic phase was then dried over Na2SO4, filtered and evaporated to obtain a yellow foam. The crude products were finally purified by flash chromatography (9×30 cm silica gel column eluted with a stepped gradient of 2%-6% MeOH in CH2Cl2). Yields of the purified diastereomeric mixtures are in the range of 65-75%.

(2) 5'-MeNPOC-2'-DEOXYNUCLEOSIDE-3'-(N,N-DIISOPROPYL 2-CYANOETHYL PHOSPHORAMIDITES)

[Reaction scheme: 5'-MeNPOC-nucleoside (free 3'-OH) converted to the corresponding 3'-O-phosphoramidite bearing a P-OCH2CH2CN group]

The four deoxynucleosides were phosphitylated using either 2-cyanoethyl-N,N-diisopropyl chlorophosphoramidite, or 2-cyanoethyl-N,N,N',N'-tetraisopropylphosphorodiamidite. The following is a typical procedure. Add 16.6 g (17.4 ml; 55 mmole) of 2-cyanoethyl-N,N,N',N'-tetraisopropylphosphorodiamidite to a solution of 50 mmole 5'-MeNPOC-nucleoside and 4.3 g (25 mmole) diisopropylammonium tetrazolide in 250 mL dry CH2Cl2 under argon at ambient temperature. Continue stirring for 4-16 hours (reaction monitored by TLC: 45:45:10 hexane/CH2Cl2/Et3N). Wash the organic phase with saturated aqueous NaHCO3 and brine, then dry over Na2SO4 and evaporate to dryness. Purify the crude amidite by flash chromatography (9×25 cm silica gel column eluted with hexane/CH2Cl2/TEA, 45:45:10 for A, C, T; or 0:90:10 for G). The yield of purified amidite is about 90%.

II. PREPARATION OF LABELED DNA/HYBRIDIZATION TO ARRAY

1) PCR

PCR amplification reactions are typically conducted in a mixture composed of, per reaction: 1 µl genomic DNA; 10 µl each primer (10 pmol/µl stocks); 10 µl 10×PCR buffer (100 mM Tris.Cl pH8.5, 500 mM KCl, 15 mM MgCl2); 10 µl 2 mM dNTPs (made from 100 mM dNTP stocks); 2.5 U Taq polymerase (Perkin Elmer AmpliTaq™, 5 U/µl); and H2O to 100 µl. The cycling conditions are usually 40 cycles (94° C. 45 sec, 55° C. 30 sec, 72° C. 60 sec) but may need to be varied considerably from sample type to sample type. These conditions are for 0.2 mL thin wall tubes in a Perkin Elmer 9600 thermocycler. See Perkin Elmer 1992/93 catalogue for 9600 cycle time information. Target, primer length and sequence composition, among other factors, may also affect parameters.

For products in the 200 to 1000 bp size range, check 2 µl of the reaction on a 1.5% 0.5×TBE agarose gel using an appropriate size standard (phiX174 cut with HaeIII is convenient). The PCR reaction should yield several picomoles of product. It is helpful to include a negative control (i.e., 1 µl TE instead of genomic DNA) to check for possible contamination. To avoid contamination, keep PCR products from previous experiments away from later reactions, using filter tips as appropriate. Using a set of working solutions and storing master solutions separately is helpful, so long as one does not contaminate the master stock solutions.

For simple amplifications of short fragments from genomic DNA it is, in general, unnecessary to optimize Mg2+ concentrations. A good procedure is the following: make a master mix minus enzyme; dispense the genomic DNA samples to individual tubes or reaction wells; add enzyme to the master mix; and mix and dispense the master solution to each well, using a new filter tip each time.

2) PURIFICATION

Removal of unincorporated nucleotides and primers from PCR samples can be accomplished using the Promega Magic PCR Preps DNA purification kit. One can purify the whole sample, following the instructions supplied with the kit (proceed from section IIIB, 'Sample preparation for direct purification from PCR reactions'). After elution of the PCR product in 50 µl of TE or H2O, one centrifuges the eluate for 20 sec at 12,000 rpm in a microfuge and carefully transfers 45 µl to a new microfuge tube, avoiding any visible pellet. Resin is sometimes carried over during the elution step. This transfer prevents accidental contamination of the linear amplification reaction with 'Magic PCR' resin. Other methods, e.g. size exclusion chromatography, may also be used.

3) LINEAR AMPLIFICATION

In a 0.2 mL thin-wall PCR tube mix: 4 µl purified PCR product; 2 µl primer (10 pmol/µl); 4 µl 10×PCR buffer; 4 µl dNTPs (2 mM dA, dC, dG, 0.1 mM dT); 4 µl 0.1 mM dUTP; 1 µl 1 mM fluorescein dUTP (Amersham RPN 2121); 1 U Taq polymerase (Perkin Elmer, 5 U/µl); and add H2O to 40 µl. Conduct 40 cycles (92° C. 30 sec, 55° C. 30 sec, 72° C. 90 sec) of PCR. These conditions have been used to amplify a 300 nucleotide mitochondrial DNA fragment but are generally applicable. Even in the absence of a visible product band on an agarose gel, there should still be enough product to give an easily detectable hybridization signal. If one is not treating the DNA with uracil DNA glycosylase (see Section 4), dUTP can be omitted from the reaction.

4) FRAGMENTATION

Purify the linear amplification product using the Promega Magic PCR Preps DNA purification kit, as per Section 2 above. In a 0.2 mL thin-wall PCR tube mix: 40 µl purified labeled DNA; 4 µl 10×PCR buffer; and 0.5 µl uracil DNA glycosylase (BRL 1 U/µl). Incubate the mixture 15 min at 37° C., then 10 min at 97° C.; store at -20° C. until ready to use.

5) HYBRIDIZATION, SCANNING & STRIPPING

A blank scan of the slide in hybridization buffer only is helpful to check that the slide is ready for use. The buffer is removed from the flow cell and replaced with 1 mL of (fragmented) DNA in hybridization buffer and mixed well. The scan is performed in the presence of the labeled target. FIG. 33 shows an illustrative detection system for scanning a DNA chip. A series of scans at 30 min intervals using a hybridization temperature of 25° C. yields a very clear signal, usually in at least 30 min to two hours, but it may be desirable to hybridize longer, i.e., overnight. Using a laser power of 50 µW and 50 µm pixels, one should obtain maximum counts in the range of hundreds to low thousands/
pixel for a new slide. When finished, the slide can be stripped using 50% formamide, rinsing well in deionized H2O, blowing dry, and storing at room temperature.

III. PREPARATION OF LABELED RNA/HYBRIDIZATION TO ARRAY

1) TAGGED PRIMERS

The primers used to amplify the target nucleic acid should have promoter sequences if one desires to produce RNA from the amplified nucleic acid. Suitable promoter sequences are shown below and include:
(1) the T3 promoter sequence:
5'-CGGAATTAACCCTCACTAAAGG (SEQ. ID NO:298)
5'-AATTAACCCTCACTAAAGGGAG; (SEQ. ID NO:299)
(2) the T7 promoter sequence:
5'-TAATACGACTCACTATAGGGAG; (SEQ. ID NO:300)
and (3) the SP6 promoter sequence:
5'-ATTTAGGTGACACTATAGAA. (SEQ. ID NO:301)
The desired promoter sequence is added to the 5' end of the PCR primer. It is convenient to add a different promoter to each primer of a PCR primer pair so that either strand may be transcribed from a single PCR product.

Synthesize PCR primers so as to leave the DMT group on. DMT-on purification is unnecessary for PCR but appears to be important for transcription. Add 25 µl 0.5M NaOH to collection vial prior to collection of oligonucleotide to keep the DMT group on. Deprotect using standard chemistry; 55° C. overnight is convenient.

HPLC purification is accomplished by drying down the oligonucleotides, resuspending in 1 mL 0.1 M TEAA (dilute 2.0M stock in deionized water, filter through 0.2 micron filter) and filtering through a 0.2 micron filter. Load 0.5 mL on reverse phase HPLC (column can be a Hamilton PRP-1 semi-prep, #79426). The gradient is 0→50% CH3CN over 25 min (program 0.2 µmol.prep.0-50, 25 min). Pool the desired fractions, dry down, resuspend in 200 µl 80% HAc. 30 min RT. Add 200 µl EtOH; dry down. Resuspend in 200 µl H2O, plus 20 µl NaAc pH5.5, 600 µl EtOH. Leave 10 min on ice; centrifuge 12,000 rpm for 10 min in microfuge. Pour off supernatant. Rinse pellet with 1 mL EtOH, dry, resuspend in 200 µl H2O. Dry, resuspend in 200 µl TE. Measure A260, prepare a 10 pmol/µl solution in TE (10 mM Tris.Cl pH 8.0, 0.1 mM EDTA). Following HPLC purification of a 42 mer, a yield in the vicinity of 15 nmol from a 0.2 µmol scale synthesis is typical.

2) GENOMIC DNA PREPARATION

For obtaining genomic DNA from human hair, one can extract as few as 5 hairs, including hair roots. On a clean and sterile surface, one places the hair on a piece of parafilm, cuts off the roots with a new razor blade that has been wiped with EtOH, and transfers the roots to a 1.5 mL microfuge tube using a pair of Millipore forceps cleaned with EtOH. Add 500 µl (10 mM Tris.Cl pH8.0, 10 mM EDTA, 100 mM NaCl, 2% (w/v) SDS, 40 mM DTT, filter sterilized) to the sample. Add 1.25 µl 20 mg/ml proteinase K (Boehringer). Incubate at 55° C. for 2 hours, vortexing once or twice. Perform 2×0.5 mL 1:1 phenol:CHCl3 extractions. After each extraction, centrifuge 12,000 rpm 5 min in a microfuge and recover 0.4 mL supernatant. Add 35 µl NaAc pH5.2 plus 1 mL EtOH. Place sample on ice 45 min; then centrifuge 12,000 rpm 30 min, rinse, air dry 30 min, and resuspend in 100 µl TE.

3) PCR

PCR is performed in a mixture containing, per reaction: 1 µl genomic DNA; 4 µl each primer (10 pmol/µl stocks); 4 µl 10×PCR buffer (100 mM Tris.Cl pH8.5, 500 mM KCl, 15 mM MgCl2); 4 µl 2 mM dNTPs (made from 100 mM dNTP stocks); 1 U Taq polymerase (Perkin Elmer, 5 U/µl); H2O to 40 µl. About 40 cycles (94° C. 30 sec, 55° C. 30 sec, 72° C. 30 sec) are performed, but cycling conditions may need to be varied. These conditions are for 0.2 mL thin wall tubes in a Perkin Elmer 9600. For products in the 200 to 1000 bp size range, check 2 µl of the reaction on a 1.5% 0.5×TBE agarose gel using an appropriate size standard. For larger or smaller volumes (20-100 µl), one can use the same amount of genomic DNA but adjust the other ingredients accordingly.

4) IN VITRO TRANSCRIPTION

Mix: 3 µl PCR product; 4 µl 5×buffer; 2 µl DTT; 2.4 µl 10 mM rNTPs (100 mM solutions from Pharmacia); 0.48 µl 10 mM fluorescein-UTP (Fluorescein-12-UTP, 10 mM solution, from Boehringer Mannheim); 0.5 µl RNA polymerase (Promega T3 or T7 RNA polymerase); and add H2O to 20 µl. Incubate at 37° C. for 3 h. Check 2 µl of the reaction on a 1.5% 0.5×TBE agarose gel using a size standard. 5×buffer is 200 mM Tris pH 7.5, 30 mM MgCl2, 10 mM spermidine, 50 mM NaCl, and 100 mM DTT (supplied with enzyme). The PCR product needs no purification and can be added directly to the transcription mixture. A 20 µl reaction is suggested for an initial test experiment and hybridization; a 100 µl reaction is considered "preparative" scale (the reaction can be scaled up to obtain more target). The amount of PCR product to add is variable; typically a PCR reaction will yield several picomoles of DNA. If the PCR reaction does not produce that much target, then one should increase the amount of DNA added to the transcription reaction (as well as optimize the PCR). The ratio of fluorescein-UTP to UTP suggested above is 1:5, but ratios from 1:3 to 1:10 all work well. One can also label with biotin-UTP and detect with streptavidin-FITC to obtain similar results as with fluorescein-UTP detection.

For nondenaturing agarose gel electrophoresis of RNA, note that the RNA band will normally migrate somewhat faster than the DNA template band, although sometimes the two bands will comigrate. The temperature of the gel can affect the migration of the RNA band. The RNA produced from in vitro transcription is quite stable and can be stored for months (at least) at -20° C. without any evidence of degradation. It can be stored in unsterilized 6×SSPE/0.1% Triton X-100 at -20° C. for days (at least) and reused twice (at least) for hybridization, without taking any special precautions in preparation or during use. RNase contamination should of course be avoided. When extracting RNA from cells, it is preferable to work very rapidly and to use strongly denaturing conditions. Avoid using glassware previously contaminated with RNases. Use of new disposable plasticware (not necessarily sterilized) is preferred, as new plastic tubes, tips, etc., are essentially RNase free. Treatment with DEPC or autoclaving is typically not necessary.

5) FRAGMENTATION

In a 0.2 mL thin-wall PCR tube mix: 18 µl RNA (direct from transcription reaction; no purification required); 18 µl H2O; and 4 µl 1M Tris.Cl pH9.0. Incubate at 99.9° C. for 60 min. Add to 1 mL hybridization buffer and store at -20° C. until ready to use. The alkaline hydrolysis step is very reliable. The hydrolysed target can be stored at -20° C. in 6×SSPE/0.1% Triton X-100 for at least several days prior to use and can also be reused.

6) HYBRIDIZATION, SCANNING, & STRIPPING

A blank scan of the slide in hybridization buffer only is helpful to check that the slide is ready for use. The buffer is removed from the flow cell and replaced with 1 mL of (hydrolysed) RNA in hybridization buffer and mixed well. Incubate for 15-30 min at 18° C. Remove the hybridization solution, which can be saved for subsequent experiments. Rinse the flow cell 4-5 times with fresh changes of 6×SSPE/0.1% Triton X-100, equilibrated to 18° C. The rinses can be
performed rapidly, but it is important to empty the flow cell before each new rinse and to mix the liquid in the cell thoroughly. The scan is performed in the presence of the labeled target. A series of scans at 30 min intervals using a hybridization temperature of 25° C. yields a very clear signal, usually in at least 30 min to two hours, but it may be desirable to hybridize longer, i.e., overnight. Using a laser power of 50 µW and 50 µm pixels, one should obtain maximum counts in the range of hundreds to low thousands/pixel for a new slide. When finished, the slide can be stripped using 50% to 100% formamide at 50° C. for 30 min, rinsing well in deionized H2O, blowing dry, and storing at room temperature.

These conditions are illustrative and assume a probe length of ~15 nucleotides. The stripping conditions suggested are fairly severe, but some signal may remain on the slide if the washing is not stringent. Nevertheless, the counts remaining after the wash should be very low in comparison to the signal in presence of target RNA. In some cases, much gentler stripping conditions are effective. The lower the hybridization temperature and the longer the duration of hybridization, the more difficult it is to strip the slide. Longer targets may be more difficult to strip than shorter targets.
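
The per-reaction recipes of Sections II and III above lend themselves to a simple scaling calculation when a master mix is prepared for several reactions. The Python sketch below is illustrative only. The function names, the 10% pipetting overage, and the dictionary keys are assumptions of this example and are not taken from the procedures. The volumes shown are those of the 100 µl PCR of Section II, with 2.5 U of Taq at 5 U/µl taken as 0.5 µl.

```python
def master_mix(per_reaction_ul, n_reactions, final_ul, template_ul, overage=0.10):
    """Scale per-reaction volumes (in microlitres) to a master mix without template.

    Water is computed as whatever brings one reaction to final_ul once the
    template and all listed components are accounted for.
    """
    water_ul = final_ul - template_ul - sum(per_reaction_ul.values())
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    scale = n_reactions * (1.0 + overage)   # overage is an assumed pipetting allowance
    mix = {name: round(vol * scale, 2) for name, vol in per_reaction_ul.items()}
    mix["water"] = round(water_ul * scale, 2)
    return mix


def utp_to_label_ratio(rntp_ul, rntp_mM, label_ul, label_mM):
    """Molar ratio of unlabeled UTP to labeled UTP in a transcription mix."""
    return (rntp_ul * rntp_mM) / (label_ul * label_mM)


# Per-reaction volumes for the 100 ul PCR of Section II (template added separately).
PCR_100_UL = {
    "primer_1": 10.0,
    "primer_2": 10.0,
    "buffer_10x": 10.0,
    "dNTPs_2mM": 10.0,
    "taq_5U_per_ul": 0.5,
}

mix = master_mix(PCR_100_UL, n_reactions=8, final_ul=100.0, template_ul=1.0)
for name, volume in mix.items():
    print(f"{name}: {volume} ul")

# Transcription mix of Section III: 2.4 ul of 10 mM rNTPs, 0.48 ul of 10 mM fluorescein-UTP.
print(round(utp_to_label_ratio(2.4, 10.0, 0.48, 10.0), 2))
```

Following the order of addition recommended in Section II, the enzyme volume would be added to the mix only after the genomic DNA samples have been dispensed. Each tube then receives the final volume less the template volume, here 99 µl.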



SEQUENCE LISTING 



( 1 ) GENERAL INFORMATION:

( i i i ) NUMBER OF SEQUENCES: 360



( 2 ) INFORMATION FOR SEQ ID NO:1:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TTGCTGACGT CAGCC

( 2 ) INFORMATION FOR SEQ ID NO:2: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:2: 



( 2 ) INFORMATION FOR SEQ ID NO:3: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:3: 



( 2 ) INFORMATION FOR SEQ ID NO:4: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:4:

TTGCTCACTT CAGCC

( 2 ) INFORMATION FOR SEQ ID NO:5:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 39 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CATTAAAGAA AATATCATCT TTGGTGTTTC CTATGATGA 

( 2 ) INFORMATION FOR SEQ ID NO:6: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 36 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:6:

CATTAAAGAA AATATCATTG GTGTTTCCTA TGATGA

( 2 ) INFORMATION FOR SEQ ID NO:7: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 36 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (genomic)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:7: 

CATTAAAGAA AATATCATTG GTGTTTCCTA TGATGA 



( 2 ) INFORMATION FOR SEQ ID NO:8:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:8:

AACACCAATG ATGAT



( 2 ) INFORMATION FOR SEQ ID NO:9: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

CCAAAGATNA TATTT 



( 2 ) INFORMATION FOR SEQ ID NO:10:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:10:

ACCAAAGANG ATATT

( 2 ) INFORMATION FOR SEQ ID NO:11:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:11:

CACCAAAGNT GATAT 



( 2 ) INFORMATION FOR SEQ ID NO:12:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:12:

ACACCAAANA TGATA

( 2 ) INFORMATION FOR SEQ ID NO:13:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:13: 

AACACCAANG ATGAT 15 



( 2 ) INFORMATION FOR SEQ ID NO:14: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:14: 



( 2 ) INFORMATION FOR SEQ ID NO: 15: 



( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nncleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 
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( i i ) MOLECULE TYPE: DNA (probe) 
(xi ) SEQUENCE DESCRIPTION: SEQ U> NO:15: 
GAAACACCNA AGATO 

( 2 ) INFORMATION FOR SEQ ID NO:l6: 

( 1 ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDED NESS : single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

GGAAACACNA AAGAT

( 2 ) INFORMATION FOR SEQ ID NO: 17: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:17:

AGGAAACANC AAAGA



( 2 ) INFORMATION FOR SEQ ID NO:18: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 21 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

CCTTCAGAGG GTAAAATTAA G 



( 2 ) INFORMATION FOR SEQ ID NO:19: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 21 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:19: 



( 2 ) INFORMATION FOR SEQ ID NO:20: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 44 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:20:

TAATACGACT CACTATAGGG AGATGACCTA ATAATGATGG GTTT

( 2 ) INFORMATION FOR SEQ ID NO:21:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 43 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:21:

TAATACGACT CACTATAGGG AGTAGTGTGA AGGGTTCATA TGC 

( 2 ) INFORMATION FOR SEQ ID NO:22: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 45 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

CTCGGAATTA ACCCTCACTA AAGGTAGTGT GAAGGGTTCA TATGC 

( 2 ) INFORMATION FOR SEQ ID NO:23: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 43 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:23: 

TAATACGACT CACTATAGGG AGAGCATACT AAAAGTGACT CTC 

( 2 ) INFORMATION FOR SEQ ID NO:24: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 44 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

TAATACGACT CACTATAGGG AGACATGAAT GACATTTACA GCAA 

( 2 ) INFORMATION FOR SEQ ID NO:25: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 44 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:25:

CGGAATTAAC CCTCACTAAA GGACATGAAT GACATTTACA GCAA



( 2 ) INFORMATION FOR SEQ ID NO:26:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:26:

TTTATGGGGT GA

( 2 ) INFORMATION FOR SEQ ID NO:27: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:27: 

TTGATTTATG GG 

( 2 ) INFORMATION FOR SEQ ID NO:28:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:28:

AACCTATTTG ATT 

( 2 ) INFORMATION FOR SEQ ID NO:29:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

GGACCAAACC TA 

( 2 ) INFORMATION FOR SEQ ID NO:30:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:30:

AGGCTAGGAC CA 



( 2 ) INFORMATION FOR SEQ ID NO:31:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:31:

GGTOTGTGTG TCC 13 

( 2 ) INFORMATION FOR SEQ ID NO:32: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 14 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:32:

CGGTGTGTGT GTGC 14 

( 2 ) INFORMATION FOR SEQ ID NO:33: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 14 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:33:

GGTGTGTGTG TGCT 14 

( 2 ) INFORMATION FOR SEQ ID NO:34: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

CTGCGTAGGA TC 12 



( 2 ) INFORMATION FOR SEQ ID NO:35: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:35: 

TGCTGGGTAG GA



( 2 ) INFORMATION FOR SEQ ID NO:36: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:36:

TGTCCTGCGT AG

( 2 ) INFORMATION FOR SEQ ID NO:37: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:37:

GTTAGCAGCG GT 

( 2 ) INFORMATION FOR SEQ ID NO:38: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:38:

GGGTTAGCAG CG 

( 2 ) INFORMATION FOR SEQ ID NO:39: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:39:

AGCGGCGCAG G 

( 2 ) INFORMATION FOR SEQ ID NO:40: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:40:

AGCGGGGGAG 



( 2 ) INFORMATION FOR SEQ ID NO:41:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:41:

GGTTGGTTCG G 



( 2 ) INFORMATION FOR SEQ ID NO:42:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:42:

CCGTTTCOTT GG 

( 2 ) INFORMATION FOR SEQ ID NO:43:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

GATCTTTGGG GT 



( 2 ) INFORMATION FOR SEQ ID NO:44: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:44: 

GGGTGATCTT TG 



( 2 ) INFORMATION FOR SEQ ID NO:45: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:45: 



( 2 ) INFORMATION FOR SEQ ID NO:46: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:46: 



( 2 ) INFORMATION FOR SEQ ID NO:47: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:47:

CCTACATAAA CTG

( 2 ) INFORMATION FOR SEQ ID NO:48:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:48:

GAGGTAAGCT ACA 

( 2 ) INFORMATION FOR SEQ ID NO:49: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

GAGGAGGTAA GC

( 2 ) INFORMATION FOR SEQ ID NO:50: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:50:

TGCTTTGAGG AG 

( 2 ) INFORMATION FOR SEQ ID NO:51: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:51:

AGTGTATTGC TTT 



( 2 ) INFORMATION FOR SEQ ID NO:52: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:52:

CATTTTCAGT GTA

( 2 ) INFORMATION FOR SEQ ID NO:53: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:53:

TAAACATTTT CAG 

( 2 ) INFORMATION FOR SEQ ID NO:54: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:54: 

AGCCCGTCTA AA 



( 2 ) INFORMATION FOR SEQ ID NO:55: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

GAGCCCGTCT AA 






( 2 ) INFORMATION FOR SEQ ID NO:56: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:56: 

TGATGTGAGC CC



( 2 ) INFORMATION FOR SEQ ID NO:57: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:57: 

GGGGTGATGT GA 



( 2 ) INFORMATION FOR SEQ ID NO:58:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:58:

GAOTGGCAGG G 



( 2 ) INFORMATION FOR SEQ ID NO:59: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:59: 



( 2 ) INFORMATION FOR SEQ ID NO:60: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 14 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:60:

GATTAGTAGT ATGG 



( 2 ) INFORMATION FOR SEQ ID NO:61: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:61:



( 2 ) INFORMATION FOR SEQ ID NO:62:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:62:



( 2 ) INFORMATION FOR SEQ ID NO:63: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 




( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:63:

GGGTTCTATT GAA

( 2 ) INFORMATION FOR SEQ ID NO:64: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:64: 

GCGGGGGTTG 

( 2 ) INFORMATION FOR SEQ ID NO:65: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:65: 

ATGGGCGGGG 

( 2 ) INFORMATION FOR SEQ ID NO:66: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66: 

TAGGATGGGC G 






( 2 ) INFORMATION FOR SEQ ID NO:67: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:67: 

TGGGTAGGAT GG 



( 2 ) INFORMATION FOR SEQ ID NO:68: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:68:

GTGCTCGGTA GG



( 2 ) INFORMATION FOR SEQ ID NO:69: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:69: 

TGTGTGTGCT GG 



( 2 ) INFORMATION FOR SEQ ID NO:70: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:70: 

GCGGTGTGTG TG 



( 2 ) INFORMATION FOR SEQ ID NO:71: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:71: 

TAGCAGCGGT GT 



( 2 ) INFORMATION FOR SEQ ID NO: 72: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:72: 

TGGCGTTAGC AG 



( 2 ) INFORMATION FOR SEQ ID NO: 73: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:73: 

GGTATGGGGT TA 



( 2 ) INFORMATION FOR SEQ ID NO:74:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:74:

GTTCGGGGTA TG

( 2 ) INFORMATION FOR SEQ ID NO:75: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:75: 

GCTGGTGTTA GG 



( 2 ) INFORMATION FOR SEQ ID NO:76: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:76:

GGTTAGGCTG GT 






( 2 ) INFORMATION FOR SEQ ID NO:77: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:77:

AAATCTGGTT AGG



( 2 ) INFORMATION FOR SEQ ID NO: 78: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:78:

AAATTTGAAA TCT 



( 2 ) INFORMATION FOR SEQ ID NO:79: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:79:

AAGATAAAAT TTG 



( 2 ) INFORMATION FOR SEQ ID NO:80: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:80: 

GCCAAAAAGA TA 



( 2 ) INFORMATION FOR SEQ ID NO:81: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:81: 

CGCCAAAAAG A



( 2 ) INFORMATION FOR SEQ ID NO:82: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:82:

CATACCGCCA A 



( 2 ) INFORMATION FOR SEQ ID NO:83: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:83: 

AAAAGTGCAT ACC 



( 2 ) INFORMATION FOR SEQ ID NO:84: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:84:

TCTTAAAAGT GCA

( 2 ) INFORMATION FOR SEQ ID NO:85: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:85: 

GGGTGACTGT TAA

( 2 ) INFORMATION FOR SEQ ID NO:86: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:86:

GGGGGTGACT GT 

( 2 ) INFORMATION FOR SEQ ID NO:87: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:87: 

AGTTGGGGGG T 

( 2 ) INFORMATION FOR SEQ ID NO:88: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:88: 

TGTGTTAGTT GGG 



( 2 ) INFORMATION FOR SEQ ID NO:89: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:89:

AAAATAATGT GTT 



( 2 ) INFORMATION FOR SEQ ID NO:90:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:90:



( 2 ) INFORMATION FOR SEQ ID NO:91:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:91:



( 2 ) INFORMATION FOR SEQ ID NO:92: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:92: 

GGAAATTTTT TG 12 



( 2 ) INFORMATION FOR SEQ ID NO:93:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:93:



( 2 ) INFORMATION FOR SEQ ID NO:94: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:94:



( 2 ) INFORMATION FOR SEQ ID NO:95: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:95:

GAGGCGGGCT T 



( 2 ) INFORMATION FOR SEQ ID NO:96: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:96:

GCGGGGGAGG 



( 2 ) INFORMATION FOR SEQ ID NO:97: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:97:

CAGAAGCCGG G 



( 2 ) INFORMATION FOR SEQ ID NO:98: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:98:

GTAGGCCAGA AG 



( 2 ) INFORMATION FOR SEQ ID NO:99: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:99:

GTGCTCTAGG CC 



( 2 ) INFORMATION FOR SEQ ID NO:100:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:100:

TGTTTAACTG CTG



( 2 ) INFORMATION FOR SEQ ID NO:101:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:101:

TGTGTTTAAG TGC 



( 2 ) INFORMATION FOR SEQ ID NO:102: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:102: 

GCAOAGATGT GTT 



( 2 ) INFORMATION FOR SEQ ID NO: 103: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:103: 



( 2 ) INFORMATION FOR SEQ ID NO:104: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:104:



( 2 ) INFORMATION FOR SEQ ID NO:105:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:105: 



( 2 ) INFORMATION FOR SEQ ID NO:106:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:106:

TTTGTTTTTG GG 

( 2 ) INFORMATION FOR SEQ ID NO:107: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:107:

GGGTTCTTTG TT 



( 2 ) INFORMATION FOR SEQ ID NO:108:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:108:

GTGTTAGGGT TCT 



( 2 ) INFORMATION FOR SEQ ID NO:109: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 14 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:109: 

TTTAGTAAGT ATGT 



( 2 ) INFORMATION FOR SEQ ID NO: 110: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:110: 

AACACACTTT AGT 13 



( 2 ) INFORMATION FOR SEQ ID NO:111:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 14 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:111:

AATTAATTAA CACA 



( 2 ) INFORMATION FOR SEQ ID NO:112:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:112:

AAGCATTAAT TAA



( 2 ) INFORMATION FOR SEQ ID NO:113: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:113:

GTCCTACAAG CAT 



( 2 ) INFORMATION FOR SEQ ID NO: 114: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

TGTCCTACAA CCA 



( 2 ) INFORMATION FOR SEQ ID NO:115: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:115:

ATTATTATGT CCT 



( 2 ) INFORMATION FOR SEQ ID NO:116:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 14 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:116:

TTGTTATTAT TATG



( 2 ) INFORMATION FOR SEQ ID NO:117:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:117:

ATTCAAATTG TTA



( 2 ) INFORMATION FOR SEQ ID NO:118:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:118:



( 2 ) INFORMATION FOR SEQ ID NO: 119: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:119: 



( 2 ) INFORMATION FOR SEQ ID NO:120:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:120: 



( 2 ) INFORMATION FOR SEQ ID NO:121:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:121:



( 2 ) INFORMATION FOR SEQ ID NO:122:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:122:

CATGTCTGTG TGG 



( 2 ) INFORMATION FOR SEQ ID NO: 123: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

ATGATGTCTG TGT 



( 2 ) INFORMATION FOR SEQ ID NO:124: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

TTTTGTTATG ATC 



( 2 ) INFORMATION FOR SEQ ID NO:125: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:125: 

TTTTTTGTTA TGA



( 2 ) INFORMATION FOR SEQ ID NO:126: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:126: 

ATAGGGTGCT CC 



( 2 ) INFORMATION FOR SEQ ID NO: 127: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:127:

GCGACATAGG GT



( 2 ) INFORMATION FOR SEQ ID NO:128:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:128: 

TACTGCGACA TAG 



( 2 ) INFORMATION FOR SEQ ID NO:129: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:129: 

GACAGATACT GCG



( 2 ) INFORMATION FOR SEQ ID NO:130:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:130:

AATCAAAGAC AGA 



( 2 ) INFORMATION FOR SEQ ID NO:131:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:131:

AGGAATCAAA GAC



( 2 ) INFORMATION FOR SEQ ID NO: 132: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:132:

TCAGCCACGA AT

( 2 ) INFORMATION FOR SEQ ID NO:133:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:133:

ACGATGAGGC AG 

( 2 ) INFORMATION FOR SEQ ID NO: 134: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:134: 

AAATAATAGG ATG 






( 2 ) INFORMATION FOR SEQ ID NO:135:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:135: 

GCGATAAATA AT 



( 2 ) INFORMATION FOR SEQ ID NO:136: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:136: 

TAGGATGCGA TA 



( 2 ) INFORMATION FOR SEQ ID NO:137:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:137: 



( 2 ) INFORMATION FOR SEQ ID NO:138:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:138:

TTGAACGTAG GA



( 2 ) INFORMATION FOR SEQ ID NO:139:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:139: 

AATATTGAAC GTA



( 2 ) INFORMATION FOR SEQ ID NO:140:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:140:



( 2 ) INFORMATION FOR SEQ ID NO: 141: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:141: 

TGTTCGCCTG TA 



( 2 ) INFORMATION FOR SEQ ID NO: 142: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:142:



( 2 ) INFORMATION FOR SEQ ID NO:143: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:143:

CTCCCGTGAG TG



( 2 ) INFORMATION FOR SEQ ID NO:144:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:144: 

GAGAGCTCCC GT 



( 2 ) INFORMATION FOR SEQ ID NO:145:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:145: 

ATGGAGAGCT CC 



( 2 ) INFORMATION FOR SEQ ID NO:146:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:146:

AATGCATGGA GA 



( 2 ) INFORMATION FOR SEQ ID NO:147:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:147:

ATACCAAATG CA 



( 2 ) INFORMATION FOR SEQ ID NO:148:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:148:

CACGAAAATA CCA



( 2 ) INFORMATION FOR SEQ ID NO:149: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:149: 

CCCACACCAA A



( 2 ) INFORMATION FOR SEQ ID NO:150:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:150: 

TACCCCCCAG A 



( 2 ) INFORMATION FOR SEQ ID NO:151:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:151:

TGCATACCCC C



( 2 ) INFORMATION FOR SEQ ID NO: 152: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:152: 

TCGCGTGCAT AC 



( 2 ) INFORMATION FOR SEQ ID NO:153: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:153:

GACTATCGCG TG



( 2 ) INFORMATION FOR SEQ ID NO:154:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:154:



( 2 ) INFORMATION FOR SEQ ID NO: 155: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:155: 



( 2 ) INFORMATION FOR SEQ ID NO:156: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:156:



( 2 ) INFORMATION FOR SEQ ID NO:157:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:157:

CTCCAGCGTC TC 



( 2 ) INFORMATION FOR SEQ ID NO:158:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:158:



( 2 ) INFORMATION FOR SEQ ID NO:159:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:159:

GTGCTCCOCC T 



( 2 ) INFORMATION FOR SEQ ID NO:160:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:160:

GACCCTGAAG TAG 



( 2 ) INFORMATION FOR SEQ ID NO:161: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:161: 

TTTATGACCC TGA



( 2 ) INFORMATION FOR SEQ ID NO:162:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:162: 

TTTAGGCTTT ATG 



( 2 ) INFORMATION FOR SEQ ID NO:163:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:163:

GCTATTTAGG CT 



( 2 ) INFORMATION FOR SEQ ID NO:164:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:164:

TGGCCTATTT AC

( 2 ) INFORMATION FOR SEQ ID NO:165:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:165:

ACGTGTGGGC TA 



( 2 ) INFORMATION FOR SEQ ID NO:166: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:166: 

AGGGGAACGT GT 



( 2 ) INFORMATION FOR SEQ ID NO: 167: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:167:



( 2 ) INFORMATION FOR SEQ ID NO:168:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 14 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:168: 



( 2 ) INFORMATION FOR SEQ ID NO:169: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:169:



( 2 ) INFORMATION FOR SEQ ID NO:170:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:170:

TCCATCGTGA TG 

( 2 ) INFORMATION FOR SEQ ID NO:171:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:171: 

GATGATCCAT CG 

( 2 ) INFORMATION FOR SEQ ID NO:172:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:172:

AGACCTGATG ATC 

( 2 ) INFORMATION FOR SEQ ID NO: 173: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:173: 

GGGTGATAGA CCT

( 2 ) INFORMATION FOR SEQ ID NO: 174: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:174:

ATAGGGTGAT AGA 



( 2 ) INFORMATION FOR SEQ ID NO:175: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:175:

TGGTTAATAG GG



( 2 ) INFORMATION FOR SEQ ID NO:176:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:176: 

GTGAGTGGTT AAT



( 2 ) INFORMATION FOR SEQ ID NO:177: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:177: 

TGTGCGGGAT AT 



( 2 ) INFORMATION FOR SEQ ID NO:178:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:178:

ACTCTTGTGC GG 



( 2 ) INFORMATION FOR SEQ ID NO:179:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:179:

TAGCACTCTT GTG 



( 2 ) INFORMATION FOR SEQ ID NO:180:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:180:



( 2 ) INFORMATION FOR SEQ ID NO:181:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:181:

GCCACCACAG TA



( 2 ) INFORMATION FOR SEQ ID NO: 182: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:182: 

CGGAGCGAGG A 



( 2 ) INFORMATION FOR SEQ ID NO: 183: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:183:

GGCCCGGAGC 



( 2 ) INFORMATION FOR SEQ ID NO: 184: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:184:

TTATGGGCCC G 



( 2 ) INFORMATION FOR SEQ ID NO: 185: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:185:

AGTGTTATGG GC 



( 2 ) INFORMATION FOR SEQ ID NO:186:






( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:186:

TACCCCCAAG TO 



( 2 ) INFORMATION FOR SEQ ID NO:187: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:187: 

TTTAGCTACC CC 



( 2 ) INFORMATION FOR SEQ ID NO:188:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:188:

TTCACTTTAG CTA



( 2 ) INFORMATION FOR SEQ ID NO:189: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:189: 

TACAGTTCAC TTT 



( 2 ) INFORMATION FOR SEQ ID NO:190: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:190:

TCGAGATACA GTT 



( 2 ) INFORMATION FOR SEQ ID NO:191:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:191:

CAGATGTCGA GAT



( 2 ) INFORMATION FOR SEQ ID NO: 192: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:192:

AGGAACCAGA TG 



( 2 ) INFORMATION FOR SEQ ID NO:193: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:193: 

GAAGTAGGAA CCA 



( 2 ) INFORMATION FOR SEQ ID NO:194: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 

GACTGTAATG TGC



( 2 ) INFORMATION FOR SEQ ID NO:195: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:195: 

GGGATTTGAC TGT



( 2 ) INFORMATION FOR SEQ ID NO:196:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:196:

AGCGATTTGA CT



( 2 ) INFORMATION FOR SEQ ID NO:197:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:197:



( 2 ) INFORMATION FOR SEQ ID NO:198:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 



( 2 ) INFORMATION FOR SEQ ID NO: 199: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:199: 



( 2 ) INFORMATION FOR SEQ ID NO:200: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:200:



( 2 ) INFORMATION FOR SEQ ID NO:201: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:201: 



( 2 ) INFORMATION FOR SEQ ID NO:202:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:202: 

TATCTGAGGG GG 

( 2 ) INFORMATION FOR SEQ ID NO:203: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:203:

ACCCCTATCT GA 

( 2 ) INFORMATION FOR SEQ ID NO:204: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:204:

AGGGACCCCT A 

( 2 ) INFORMATION FOR SEQ ID NO:205: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:205:

TGGTCAAGGG AC 

( 2 ) INFORMATION FOR SEQ ID NO:206:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:206: 

GGATGGTGGT CA 



( 2 ) INFORMATION FOR SEQ ID NO:207:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:207:

AGGATCGTGG TC



( 2 ) INFORMATION FOR SEQ ID NO:208: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:208: 

ACACGGAGGA TG 



( 2 ) INFORMATION FOR SEQ ID NO:209: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:209: 

TGATTTACAC GG 



( 2 ) INFORMATION FOR SEQ ID NO:210:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:210: 

GGGATATTGA TTT 



( 2 ) INFORMATION FOR SEQ ID NO:211:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:211: 

GTGGCATTTG OA 



( 2 ) INFORMATION FOR SEQ ID NO:212: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:212:

AGGGGTGCCA T

( 2 ) INFORMATION FOR SEQ ID NO:213:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:213:

GGTGAGGGGT G 

( 2 ) INFORMATION FOR SEQ ID NO:214:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:214: 

AGTGGGTGAG GG 

( 2 ) INFORMATION FOR SEQ ID NO:215: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:215: 

GTATCCTAGT GGG 

( 2 ) INFORMATION FOR SEQ ID NO:216: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:216: 

TTTGTTGGTA TCC 



( 2 ) INFORMATION FOR SEQ ID NO:217: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:217:

GTAGGTTTGT TCC 



( 2 ) INFORMATION FOR SEQ ID NO:218:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:218:

TGGGTAGGTT TG 



( 2 ) INFORMATION FOR SEQ ID NO:219: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:219: 

TAAGGGTGGG TA






( 2 ) INFORMATION FOR SEQ ID NO:220:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:220:

GTACTGTTAA GGG 



( 2 ) INFORMATION FOR SEQ ID NO:221:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 14 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:221: 



( 2 ) INFORMATION FOR SEQ ID NO:222: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:222:



( 2 ) INFORMATION FOR SEQ ID NO:223: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 






( i i ) MOLECULE TYPE: DNA (probe) 
( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:223: 
AAATGGCTTT AT 

( 2 ) INFORMATION FOR SEQ ID NO:224: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:224: 

GGTAAATGGC TT 

( 2 ) INFORMATION FOR SEQ ID NO:225: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:225:

TGTACCGTAA ATG 

( 2 ) INFORMATION FOR SEQ ID NO:226: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:226: 

GTGCTAATGT ACG 

( 2 ) INFORMATION FOR SEQ ID NO:227: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:227: 

TAATGTGCTA ATG 



( 2 ) INFORMATION FOR SEQ ID NO:228:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:228:

CATGGGCAGG G

( 2 ) INFORMATION FOR SEQ ID NO:229:

( i ) SEQUENCE CHARACTERISTICS: 

( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:229: 

TGTAAGCATG GG 



( 2 ) INFORMATION FOR SEQ ID NO:230:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:230:

TTGCTTGTAA GCA






( 2 ) INFORMATION FOR SEQ ID NO:231:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:231: 

TGTACTTGCT TGT 

( 2 ) INFORMATION FOR SEQ ID NO:232: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:232: 

TTGCTGTACT TGC 



( 2 ) INFORMATION FOR SEQ ID NO:233: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:233: 

GGTTGATTGC TG



( 2 ) INFORMATION FOR SEQ ID NO:234:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:234:

TTGAGGCTTC AT 

( 2 ) INFORMATION FOR SEQ ID NO:235: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:235: 

GTGATAGTTG AGG



( 2 ) INFORMATION FOR SEQ ID NO:236: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:236: 

TTGATGTGTG ATA 



( 2 ) INFORMATION FOR SEQ ID NO:237:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:237:

TGCAGTTGAT GTG



( 2 ) INFORMATION FOR SEQ ID NO:238:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:238:

TGGAGTTGCA GT 



( 2 ) INFORMATION FOR SEQ ID NO:239:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:239:

ATTTGGAGTT GC 12

( 2 ) INFORMATION FOR SEQ ID NO:240:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:240:

TACCGTACAA TAT 13

( 2 ) INFORMATION FOR SEQ ID NO:241: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:241:

TGGTACCGTA CAA 13

( 2 ) INFORMATION FOR SEQ ID NO:242: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:242:

TATTTATGGT ACC 13 

( 2 ) INFORMATION FOR SEQ ID NO:243: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:243:

GGTCAAGTAT TTA 13

( 2 ) INFORMATION FOR SEQ ID NO:244: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:244:

TACAGGTGGT CAA

( 2 ) INFORMATION FOR SEQ ID NO:245: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:245:

ATGTACTACA GGT 

( 2 ) INFORMATION FOR SEQ ID NO:246: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:246:

GGTTTTTATG TAC 

( 2 ) INFORMATION FOR SEQ ID NO:247: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:247:

GGATTGGGTT TT 

( 2 ) INFORMATION FOR SEQ ID NO:248:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:248:

TGTAGGATTG GG 



( 2 ) INFORMATION FOR SEQ ID NO:249: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:249:

GTTTTGATGT AGG 



( 2 ) INFORMATION FOR SEQ ID NO:250:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:250:

CGGTTTTCAT GT 



( 2 ) INFORMATION FOR SEQ ID NO:251:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 11 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:251:

OGAGGCGGTT T 



( 2 ) INFORMATION FOR SEQ ID NO:252: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:252:

GTCAATACTT GGG 



( 2 ) INFORMATION FOR SEQ ID NO:253: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:253:

GGGTGAGTCA ATA 



( 2 ) INFORMATION FOR SEQ ID NO:254: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:254: 

TGGGTGAGTC AA 



( 2 ) INFORMATION FOR SEQ ID NO:255: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:255:

TGTTGATGGG TG



( 2 ) INFORMATION FOR SEQ ID NO:256:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:256: 

CGGTTGTTGA TG 



( 2 ) INFORMATION FOR SEQ ID NO:257:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:257: 

ACATAGCGGT TG 



( 2 ) INFORMATION FOR SEQ ID NO:258:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:258: 

GAAAATACAT AGC 



( 2 ) INFORMATION FOR SEQ ID NO:259:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:259:

AATGTACGAA AAT



( 2 ) INFORMATION FOR SEQ ID NO:260:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:260:

GCACTAATGT ACG

( 2 ) INFORMATION FOR SEQ ID NO:261:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:261:

TGGCTGGCAG TA 

( 2 ) INFORMATION FOR SEQ ID NO:262: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:262: 

TCATGGTGGC TG 

( 2 ) INFORMATION FOR SEQ ID NO:263: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:263: 

ACAATATTCA TGG 

( 2 ) INFORMATION FOR SEQ ID NO:264: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:264:

TAGAATCTTA GCT 



( 2 ) INFORMATION FOR SEQ ID NO:265: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:265:

TTTAAATTAG AAT



( 2 ) INFORMATION FOR SEQ ID NO:266:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:266:

GAATAAGTTT AAA 

( 2 ) INFORMATION FOR SEQ ID NO:267: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:267:

CAACAGAGAA TAA

( 2 ) INFORMATION FOR SEQ ID NO:268:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:268:

AAAGAACAGA GAA



( 2 ) INFORMATION FOR SEQ ID NO:269: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:269: 

CCCATGAAAG AA

( 2 ) INFORMATION FOR SEQ ID NO:270:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:270: 



( 2 ) INFORMATION FOR SEQ ID NO:271: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:271:

ATCTGCTTCC CC



( 2 ) INFORMATION FOR SEQ ID NO:272:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:272: 

CAAATCTGCT TC 



( 2 ) INFORMATION FOR SEQ ID NO:273: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:273: 

GGTACCCAAA TC 



( 2 ) INFORMATION FOR SEQ ID NO:274: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:274: 

GGTGGTACCC AA 



( 2 ) INFORMATION FOR SEQ ID NO:275: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:275: 

TACTTGGCTG GT 



( 2 ) INFORMATION FOR SEQ ID NO:276:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:276:

TCGAAAAAGG tt

( 2 ) INFORMATION FOR SEQ ID NO:277:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:277: 

GTCCTTGGAA AA 

( 2 ) INFORMATION FOR SEQ ID NO:278: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:278:

ATTTGTCCTT GG 

( 2 ) INFORMATION FOR SEQ ID NO:279: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:279: 

CTCTGATTTG TCC 

( 2 ) INFORMATION FOR SEQ ID NO:280: 

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:280: 

TTTTTCTCTG ATT 



( 2 ) INFORMATION FOR SEQ ID NO:281: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:281: 

TAAAGACTTT TTC 



( 2 ) INFORMATION FOR SEQ ID NO:282:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 13 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:282:

GTGGAGTTAA AGA

( 2 ) INFORMATION FOR SEQ ID NO:283:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 13 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:283: 

TGGTGGAGTT AAA 



( 2 ) INFORMATION FOR SEQ ID NO:284: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:284: 

TGCTAATGGT GG 



( 2 ) INFORMATION FOR SEQ ID NO:285: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:285: 

TTGGGTGCTA AT 12 



( 2 ) INFORMATION FOR SEQ ID NO:286: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:286:



( 2 ) INFORMATION FOR SEQ ID NO:287:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:287:

TCTTAGCTTT CG



( 2 ) INFORMATION FOR SEQ ID NO:288:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 22 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:288:

CACTTGTGCC CTGACTTTCA AC 



( 2 ) INFORMATION FOR SEQ ID NO:289:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 49 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:289:

ATGCAATTAA CCCTCACTAA AGGGAGACAC TTGTGCCCTG ACTTTCAAC

( 2 ) INFORMATION FOR SEQ ID NO:290: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:290: 

GACCCTGGGC AACCAGCCCT GTCGT 



( 2 ) INFORMATION FOR SEQ ID NO:291:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 47 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:291: 

TAATACGACT CACTATAGGG AGGACCCTGG GCAACCAGCC CTGTCGT 



( 2 ) INFORMATION FOR SEQ ID NO:292: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 25 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:292:

CTAGAATTCT GTTGACTCAG ATTGG

( 2 ) INFORMATION FOR SEQ ID NO:293:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 27 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:293: 

AAATCCATAC AATACTCCAG TATTTGC 



( 2 ) INFORMATION FOR SEQ ID NO:294: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 27 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:294: 

GATAAGCTTG GGCCTTATCT ATTCCAT

( 2 ) INFORMATION FOR SEQ ID NO:295:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 28 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:295: 

ACCCATCCAA AGGAATGGAG GTTCTTTC 

( 2 ) INFORMATION FOR SEQ ID NO:296: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:296: 

AGCCTAGCTG AA 



( 2 ) INFORMATION FOR SEQ ID NO:297: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:297: 

TCGGATCGAC TT

( 2 ) INFORMATION FOR SEQ ID NO:298:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 22 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:298:

CGGAATTAAC CCTCACTAAA GG 



( 2 ) INFORMATION FOR SEQ ID NO:299: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 22 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:299: 

AATTAACCCT CACTAAAGGG AG 



( 2 ) INFORMATION FOR SEQ ID NO:300:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 22 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:300:

TAATACGACT CACTATAGGG AG 



( 2 ) INFORMATION FOR SEQ ID NO:301:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 20 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:301: 

ATTTAGGTGA CACTATAGAA 



( 2 ) INFORMATION FOR SEQ ID NO:302: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:302:

GATNATATTT 



( 2 ) INFORMATION FOR SEQ ID NO:303:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:303:

AGANGATATT



( 2 ) INFORMATION FOR SEQ ID NO:304:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:304: 



( 2 ) INFORMATION FOR SEQ ID NO:305:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:305: 

AAANATGATA 10



( 2 ) INFORMATION FOR SEQ ID NO:306:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:306: 



( 2 ) INFORMATION FOR SEQ ID NO:307:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:307:



( 2 ) INFORMATION FOR SEQ ID NO:308:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:308:

ACCNAACATG

( 2 ) INFORMATION FOR SEQ ID NO:309:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:309:

CACNAAAGAT

( 2 ) INFORMATION FOR SEQ ID NO:310:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 10 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:310: 

AGAAACNACA 

( 2 ) INFORMATION FOR SEQ ID NO:311:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 16 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:311:

ATTTCATTCT GTATTG 

( 2 ) INFORMATION FOR SEQ ID NO:312:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 16 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:312:

CCGACTGCAG TCGTTA 



( 2 ) INFORMATION FOR SEQ ID NO:313: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:313:

CCGACTGCAG TCGTT 



( 2 ) INFORMATION FOR SEQ ID NO:314:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:314:

CCGACTACAG TCGTT 

( 2 ) INFORMATION FOR SEQ ID NO:315: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:315:

CCGACTCCAG TCGTT 



( 2 ) INFORMATION FOR SEQ ID NO:316: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 15 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:316:

CCGACTTCAG TCGTT

( 2 ) INFORMATION FOR SEQ ID NO:317:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:317:

GTAATTTCTT TTATAGTAGA AACCACAAAC GATAC

( 2 ) INFORMATION FOR SEQ ID NO:318:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 35 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:318:

CATTAAACAA AATATCATCT TTGGTGTTTC CTATG 



( 2 ) INFORMATION FOR SEQ ID NO:319: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 32 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:319:

CATTAAAGAA AATATCATTG GTGTTTCCTA TG 32

( 2 ) INFORMATION FOR SEQ ID NO:320:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 18 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:320: 

CATTAAAGAA AATATCAT 18 

( 2 ) INFORMATION FOR SEQ ID NO:321: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:321:

TATTAAAGAA AATATCATCT TTGGTGTTTC CTATC 35

( 2 ) INFORMATION FOR SEQ ID NO:322: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:322: 

CCTTAAAGAA AATATCATCT TTGGTGTTTC CTAAA 35 



( 2 ) INFORMATION FOR SEQ ID NO:323: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:323: 

CTTTAAAGAA AATAAAAAAA TTGGTGTTTC CTAAA 



( 2 ) INFORMATION FOR SEQ ID NO:324: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:324:

GCAAGTCTCC CATTTTAATT

( 2 ) INFORMATION FOR SEQ ID NO:325:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:325:

CCTTCAGAGG CTAAAATTAA 

( 2 ) INFORMATION FOR SEQ ID NO:326: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:326:

CCTTCAGAGK GTAAAATTAA

( 2 ) INFORMATION FOR SEQ ID NO:327:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 20 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:327: 

CCTTCAGAGT GTAAAATTAA 

( 2 ) INFORMATION FOR SEQ ID NO:328:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 19 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:328:

CCTTCAGAGG GTAAAATCA 



( 2 ) INFORMATION FOR SEQ ID NO:329:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 19 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:329:

CCTTCAGAGG GTAAAATTA 



( 2 ) INFORMATION FOR SEQ ID NO:330:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 19 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:330:

GATTCAGAGT GTAAAATAC 

( 2 ) INFORMATION FOR SEQ ID NO:331: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 19 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:331:

AAAAAAGAGT CTAAAATGA 



( 2 ) INFORMATION FOR SEQ ID NO:332: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 35 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:332: 

CATTAAAGAA AATAACATCA TTGGTGTTTC CTATG

( 2 ) INFORMATION FOR SEQ ID NO:333:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 648 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:333: 

AACAAACCTA CCCACCCTTA ACAGTACATA GTACATAAAG CCATTTACCG TACATAGCAC 60 

ATTACAGTCA AATCCCTTCT CGTCCCCATG GATGACCCCC CTCAGATAGG GGTCCCTTGA 120 

CCACCATCCT CCGTGAAATC AATATCCCGC ACAAGAGTGC TACTCTCCTC GCTCCGGGCC 180 

CATAACACTT GGGGGTAGCT AAAGTGAACT GTATCCGACA TCTGGTTCCT ACTTCAGGGT 240 

CATAAAGCCT AAATAGCCCA CACGTTCCCC TTAAATAAGA CATCACGATG GATCACAGGT 300 

CTATCACCCT ATTAACCACT CACGGGAGCT CTCCATGCAT TTGGTATTTT CGTCTGGGGG 360 

GTATGCACGC GATAGCATTG CGAGACGCTG GAGCCGGAGC ACCCTATGTC GCAGTATCTG 420 

TCTTTGATTC CTGCCTCATC CTATTATTTA TCGCACCTAC GTTCAATATT ACAGGCGAAC 480 

ATACTTACTA AAGTGTGTTA ATTAATTAAT GCTTGTAGGA CATAATAATA ACAATTGAAT 540 

GTCTGCACAG CCACTTTCCA CACAGACATC ATAACAAAAA ATTTCCACCA AACCCCCCCT 600 

CTCCCCCGCT TCTGGCCACA GCACTTAAAC ACATCTCTGC CAAACCCC 648 



( 2 ) INFORMATION FOR SEQ ID NO:334:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:334:

GATGCTGAGG AG 

( 2 ) INFORMATION FOR SEQ ID NO:335: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:335: 

CTCCTCCCCG GT 



( 2 ) INFORMATION FOR SEQ ID NO:336:

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:336:

ACTCCTCCCC GG 



( 2 ) INFORMATION FOR SEQ ID NO:337: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:337: 



( 2 ) INFORMATION FOR SEQ ID NO:338:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:338:



( 2 ) INFORMATION FOR SEQ ID NO:339:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:339:

ACCACTCCTC CC 12

( 2 ) INFORMATION FOR SEQ ID NO:340:

( i ) SEQUENCE CHARACTERISTICS:
( A ) LENGTH: 12 base pairs
( B ) TYPE: nucleic acid
( C ) STRANDEDNESS: single
( D ) TOPOLOGY: linear

( i i ) MOLECULE TYPE: DNA (probe)

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:340:

TACGACTCCT CC 12

( 2 ) INFORMATION FOR SEQ ID NO:341: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:341: 

CTACGACTCC TC 12 

( 2 ) INFORMATION FOR SEQ ID NO:342: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:342: 

TCTACGACTC CT 12 

( 2 ) INFORMATION FOR SEQ ID NO:343: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:343: 

TTCTACGACT CC 12 



( 2 ) INFORMATION FOR SEQ ID NO:344: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 



( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:344: 
ATTCTACGAC TC 

( 2 ) INFORMATION FOR SEQ ID NO:345: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:345: 

TATTCTACGA CT 

( 2 ) INFORMATION FOR SEQ ID NO:346: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:346: 

CTATTCTACG AC 



( 2 ) INFORMATION FOR SEQ ID NO:347: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 12 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:347: 

CCTATTCTAC GA 
( 2 ) INFORMATION FOR SEQ ID NO:348: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:348: 

TCCTCCCCGG 



( 2 ) INFORMATION FOR SEQ ID NO:349: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:349: 

CTCCTCCCCG 



( 2 ) INFORMATION FOR SEQ ID NO:350: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:350: 

ACTCCTCCCC 

( 2 ) INFORMATION FOR SEQ ID NO:351: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:351: 

GACTCCTCCC 



( 2 ) INFORMATION FOR SEQ ID NO:352: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:352: 

CGACTCCTCC 



( 2 ) INFORMATION FOR SEQ ID NO:353: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:353: 



( 2 ) INFORMATION FOR SEQ ID NO:354: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:354: 



( 2 ) INFORMATION FOR SEQ ID NO:355: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:355: 

CTACGACTCC 10 

( 2 ) INFORMATION FOR SEQ ID NO:356: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:356: 

TCTACGACTC 10 

( 2 ) INFORMATION FOR SEQ ID NO:357: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:357: 

TTCTACGACT 10 

( 2 ) INFORMATION FOR SEQ ID NO:358: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:358: 

ATTCTACGAC 10 



( 2 ) INFORMATION FOR SEQ ID NO:359: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 10 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (probe) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:359: 



( 2 ) INFORMATION FOR SEQ ID NO:360: 

( i ) SEQUENCE CHARACTERISTICS: 
( A ) LENGTH: 184 base pairs 
( B ) TYPE: nucleic acid 
( C ) STRANDEDNESS: single 
( D ) TOPOLOGY: linear 

( i i ) MOLECULE TYPE: DNA (oligonucleotide) 

( x i ) SEQUENCE DESCRIPTION: SEQ ID NO:360: 

TACTCCCCTC CCCTCAACAA GATGTTTTGC CAACTGGCCA AGACCTGCCC TGTGCAGCWQ 60 

KGGGWWOATT CCACACCCCC GCCCGGCACC CGCGTCCGCG CCATGGCCAT CTACAAGCAG 120 

TCACAGCACA TGACGGAGGW WGKGAGGCGC TGCCCCCACC ATGAGCGCYG CYCAGATAGC 180 

SAYG 184 

We claim: 

1. An array of oligonucleotide probes immobilized on a 
solid support, said array having at least 100 probes and no 
more than 100,000 different oligonucleotide probes 9 to 20 
nucleotides in length occupying separate known sites in said 
array, said oligonucleotide probes comprising at least four 
sets of probes: (1) a first set that is exactly complementary 
to a reference sequence and comprises probes that com- 
pletely span the reference sequence and, relative to the 
reference sequence, overlap one another in sequence; and (2) 
three additional sets of probes, each of which is identical to 
said first set of probes but for at least one different 
nucleotide, which different nucleotide is located in the same 
position in each of the three additional sets but which is a 
different nucleotide in each set. 
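
By way of illustration only, the probe sets recited in claim 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the synthesis method of the invention: the function name `build_probe_sets`, its parameters, and the choice to write each probe 3' to 5' (so that it aligns base by base with the 5' to 3' reference window) are hypothetical conventions introduced here, and exactly one position per probe is substituted.

```python
BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def complement(seq):
    """Base-by-base complement; the result reads 3'->5' if seq reads 5'->3'."""
    return "".join(COMPLEMENT[base] for base in seq)


def build_probe_sets(reference, probe_len, sub_pos):
    """Tile a reference with overlapping probes (hypothetical helper).

    reference : reference sequence, written 5'->3'
    probe_len : probe length in nucleotides
    sub_pos   : 0-based index, within each probe string, of the
                substituted position

    Returns (first_set, additional_sets), where first_set holds the
    exactly complementary probes and additional_sets is a list of three
    lists, each carrying a different nucleotide at sub_pos.
    """
    first_set = []
    additional_sets = [[], [], []]
    # Step one base at a time so that adjacent probes overlap and the
    # whole reference is spanned.
    for start in range(len(reference) - probe_len + 1):
        window = reference[start:start + probe_len]
        wild_type = complement(window)
        first_set.append(wild_type)
        # The three bases that differ from the wild-type base.
        others = [b for b in BASES if b != wild_type[sub_pos]]
        for k, base in enumerate(others):
            probe = wild_type[:sub_pos] + base + wild_type[sub_pos + 1:]
            additional_sets[k].append(probe)
    return first_set, additional_sets


if __name__ == "__main__":
    first, extra = build_probe_sets("ACGTAC", probe_len=4, sub_pos=1)
    print(first)
    print(extra)
```

With probes written 3' to 5', a substitution at position 7 relative to the 3'-end of a 15-mer, as in claim 5, would correspond to `sub_pos=6` in this sketch.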

2. The array of claim 1, further comprising a fourth 
additional set of probes, which fourth additional set is 
identical to probes in the first set. 

3. The array of claim 1, wherein said reference sequence 
is a double-stranded nucleic acid and probes complementary 
to both strands of said reference are in said array. 

4. The array of claim 1, wherein said probes are 12 to 17 
nucleotides in length. 

5. The array of claim 4, wherein said probes are 15 
nucleotides in length and attached by a covalent linkage to 
a site on a 3'-end of said probes, and said different nucleotide 
is located at position 7, relative to the 3'-end of said probes. 

6. The array of claim 1, wherein said reference sequence 
is exon 10 of a CFTR gene, and said array has between 1000 
and 100,000 oligonucleotide probes 10 to 18 nucleotides in 
length. 

7. The array of claim 6, wherein said array comprises a set 
of probes comprising a specific nucleotide sequence selected 
from the group of sequences consisting of: 
3'-TTTATAXTAG (SEQ ID. NO:302); 
3'-TTATAGXAGA (SEQ ID. NO:303); 
3'-TATAGTXGAA (SEQ ID. NO:304); 
3'-ATAGTAXAAA (SEQ ID. NO:305); 
3'-TAGTAGXAAC (SEQ ID. NO:306); 
3'-AGTAGAXACC (SEQ ID. NO:307); 
3'-GTAGAAXCCA (SEQ ID. NO:308); 
3'-TAGAAAXCAC (SEQ ID. NO:309); and 
3'-AGAAACXACA (SEQ ID. NO:310); wherein each set 
comprises 4 probes, and X is individually A, G, C, and T 
for each set. 
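
For illustration, the statement that each set comprises 4 probes, with X individually A, G, C, and T, can be sketched as a simple expansion of the X placeholder. The function name `expand_probe_set` and its parameters are hypothetical and are not part of the claims; the sketch assumes exactly one X per sequence.

```python
def expand_probe_set(pattern, bases="AGCT", placeholder="X"):
    """Expand one claimed sequence into its set of four probes.

    pattern     : probe sequence containing a single placeholder
    bases       : nucleotides substituted for the placeholder, in order
    placeholder : character marking the interrogation position
    """
    if pattern.count(placeholder) != 1:
        raise ValueError("pattern must contain exactly one placeholder")
    return [pattern.replace(placeholder, base) for base in bases]


if __name__ == "__main__":
    # SEQ ID NO:302 as recited above, written 3'->5'.
    for probe in expand_probe_set("TTTATAXTAG"):
        print("3'-" + probe)
```

Applied to the nine sequences recited in the claim, this expansion yields 36 probes in total, four per set.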

8. The array of claim 6, wherein said group of sequences 
consists of: 

3'-TTTATAXTAGAAACC (SEQ ID. NO:9); 
3'-TTATAGXAGAAACCA (SEQ ID. NO:10); 
3'-TATAGTXGAAACCAC (SEQ ID. NO:11); 
3'-ATAGTAXAAACCACA (SEQ ID. NO:12); 
3'-TAGTAGXAACCACAA (SEQ ID. NO:13); 
3'-AGTAGAXACCACAAA (SEQ ID. NO:14); 
3'-GTAGAAXCCACAAAG (SEQ ID. NO:15); 
3'-TAGAAAXCACAAAGG (SEQ ID. NO:16); and 
3'-AGAAACXACAAAGGA (SEQ ID. NO:17); wherein 
each set comprises 4 probes, and X is individually A, G, 
C, and T for each set. 

9. The array of claim 1, wherein said reference sequence 
is a sequence of a D-loop region of human mitochondrial 
DNA. 

10. The array of claim 9, wherein said probes are 15 
nucleotides in length, and said array comprises a first set of 
probes exactly complementary to a sequence contained in a 
sequence bounded by positions 16280 to 356 of the refer- 
ence sequence and four additional sets of probes identical to 
said first set but for position 7, relative to a 3'-end of a probe, 
which 3'-end is covalently attached to the substrate, where, 
for each of the four additional probe sets, a different nucle- 
otide is located, such that, for each probe in said first set, 
there is an identical probe in one of the four additional sets, 
and such that the array has between 2500 and 100,000 
oligonucleotide probes. 
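
As a rough arithmetic check only, the number of probes needed to tile the recited region on one strand can be estimated as below. The sketch assumes a circular mitochondrial genome numbered 1 to 16,569 (the conventional numbering of the human mitochondrial reference sequence, which is not stated in this claim), so that the region from position 16280 to position 356 wraps through the origin; the function names are hypothetical. The claim requires only a sequence contained in that region, so the figure is an upper bound for a single strand.

```python
MTDNA_LENGTH = 16569  # assumption: conventional human mtDNA numbering


def circular_span(start, end, genome_len):
    """Number of positions from start to end, inclusive, on a circle."""
    if start <= end:
        return end - start + 1
    # The region wraps past the last numbered position back to 1.
    return (genome_len - start + 1) + end


def probe_count(region_len, probe_len, n_sets):
    """Probes needed to tile a region one base at a time, in n_sets sets."""
    windows = region_len - probe_len + 1
    return windows * n_sets


if __name__ == "__main__":
    region = circular_span(16280, 356, MTDNA_LENGTH)
    # One exactly complementary set plus four additional sets.
    total = probe_count(region, probe_len=15, n_sets=5)
    print(region, total)
```

Under these assumptions the region spans 646 positions and tiling it with 15-mers in five sets gives 3,160 probes, which lies within the recited range of 2500 to 100,000.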

11. The array of claim 1, wherein said reference sequence 
is a sequence from an exon of a human p53 gene. 

12. The array of claim 11, wherein said reference 
sequence comprises at least a 60 nucleotide contiguous 
sequence from exon 6 of a p53 gene. 

13. The array of claim 11, wherein said reference 
sequence is exon 5 of a p53 gene, said probes are 17 
nucleotides long, and said array comprises a first set of 
probes exactly complementary to said sequence and at least 
three additional sets of probes, each set comprising probes 
identical to said first set but for a nucleotide at position 7, 
relative to a 3'-end of a probe, which 3'-end is covalently 
attached to the substrate, which nucleotide is different from 
a nucleotide at this position in a corresponding probe of said 
first set. 

14. The array of claim 1, wherein said probes are 
oligodeoxyribonucleotides. 

15. The array of claim 1, wherein said array has between 
10,000 and 100,000 probes. 

16. The array of claim 1, wherein the reference sequence 
is from a human immunodeficiency virus. 

17. The array of claim 16, wherein the reference sequence 
is from a reverse transcriptase gene of the human immuno- 
deficiency virus. 

18. The array of claim 1, wherein said probes are immo- 
bilized to said solid support via a linker. 